Coding Java Expert needed

Please check the attached task and tell me if you can handle it.

School of Computing, Engineering and Mathematics

Assessment Brief

Module Title:	Data structures and operating systems
Module Code:	CI583
Author(s)/Marker(s) of Assignment	Jim Burton

Assignment No:	1
Assignment Title:	Huffman coding and reflective report
Percentage contribution to module mark:	100% of the overall mark
Weighting of component assessments within this assignment:	n/a
Module learning outcomes covered:	1. Assess how the choice of data structures and algorithm design methods impacts the performance of programs. 2. Choose the appropriate data structure and algorithm design method for a specified application. 3. Demonstrate an understanding of the limitations of/merits of an operating system as a manager of normally scarce resources, 4. Describe and propose solutions to issues arising from the interactions between system and/ or user level components.

Assignment Brief and Assessment Criteria:	See pages 2 onwards

Date of issue:	29 October 2021
Deadline for submission:	15:00, 10 January 2022
Method of submission:	e-submission via studentcentral. Submit your programming assignment as a zip file containing both your edited version of the project and your reflective report as a Word document or PDF file.
Date feedback provided	10 February 2022

A copy of your coursework submission may be made as part of the University of Brighton’s and School of Computing, Engineering & Mathematics procedures which aim to monitor and improve quality of teaching. You should refer to your student handbook for details.

All work submitted must be your own (or your team’s for an assignment which has been specified as a group submission) and all sources which do not fall into that category must be correctly attributed. The markers may submit the whole set of submissions to the JISC Plagiarism Detection Service.

CI583 – Assessment 1, Huffman coding implementation and report

Author:
Jim Burton

Version:
1

Date:
16 September 2021

1 Requirement

Download or clone the Java project from the URL below. Implement your solutions to the problems given in the README file in the top level of the project. You must maintain the project structure as it is. If you submit individual Java source files, marks will be deducted. The project contains unit tests which you should use to test your progress. Note that passing the tests may not be a guarantee of correctness or of full marks for correctness. See the slides from this module and the recommended reading for more information on hashtables, hash functions and prime numbers.

1. Huffman coding. Get a copy of the code from
https://github.com/jimburton/huffman-java/
by cloning the repository or downloading a zip file that contains it. Follow the instructions in README.md to complete the implementation.

Implementation assessment criteria: 60% of overall mark

1. Correctness [40 marks]

You can test your progress towards a correct solution by running the unit tests supplied. However, a set of tests which all pass is not a guarantee of full marks in this area. Firstly, it is possible to make unit tests pass by “cooking the books”, i.e. by hard-coding certain values into the application. Secondly, you may create an implementation which “works”, i.e. which passes all tests, but which is not a true Huffman coding because you aren’t following the algorithm correctly.

To achieve 30-40 marks your implementation will be correct in all aspects and show a thorough understanding of the theory behind these data structures and algorithms, as well as showing strong programming skills.

To achieve 20-29 marks your implementation may pass some but not all unit tests but will show an understanding of some aspects of this problem. Methods that you weren’t able to get working may contain pseudo-code in comments, and some credit will be given for this.

If your code does not pass the unit tests and does not show an understanding of the problem you will receive fewer than 20 marks, but I will award credit for code that shows some understanding of the theory and implementation of these data structures and algorithms.

2. Coding style [20 marks]

To achieve these marks your code must conform to the Java style guidelines linked to on studentcentral, and show good ability with the Java programming language. That means that your code will be consistently and conventionally formatted, comments used appropriately and not over-used, variables named appropriately, and the code will be well-structured. You may have shown independent learning by using features of the language not discussed in the module to structure the code more elegantly.

2. Report. Write a report not longer than four A4 sides that includes at least two sections. In Section 1, describe the complexity of your encode and decode methods and discuss the applications for this kind of compression technique. (Note that to describe the complexity of these methods you need to consider the complexity of all methods called within them.) Comment on what you would need to add to this implementation in order to produce a fully working compression/decompression tool that stores compressed data in binary codes.

In Section 2 of the report, discuss the application of some of the data structures and algorithms you learned about in the first part of the module in the context of operating systems. This will require independent learning and research on your part. Some of the best places to find information will be the lecture slides from this module and the books on operating systems that can be found in the library, especially Operating Systems in Depth (Doeppner) and Modern Operating Systems or Operating Systems: Design and Implementation (both by Tanenbaum et al). A few non-exhaustive examples of the applications you could discuss are the use of the stack and stack frames in assembly code, the use of a binary algorithm in the Buddy System, the “disc map” data structure in the S5FS filesystem or the use of hash functions in page tables, file systems and directory listings.

Report assessment criteria: 40% of overall mark

1. Discussion of complexity [20 marks]

To achieve 15-20 marks you must show an excellent understanding of both the theory behind this data structure and of your actual implementation. You will have taken into account the complexity of all methods that are used by the encode and decode methods. You will use and demonstrate an understanding of the correct terminology regarding complexity. 5-15 marks will be awarded to work that provides less detail but in a way which is nevertheless accurate and which demonstrates an understanding of the question.

2. Data structures and operating systems [20 marks]

To achieve 15-20 marks you will have shown some good independent research and made thorough use of the module material and/or other sources to discover several applications of data structures and algorithms in the domain of operating systems. You will have explained these in a thorough and convincing way in your own words. 5-15 marks will be awarded to work that provides less detail but in a way which is nevertheless accurate and which demonstrates an understanding of the question.

In addition to these module-specific marking criteria, I will apply the University of Brighton Grading Descriptors for Undergraduate Work when marking your submission. A copy of this document can be found in the Assessment area for this module on studentcentral.

Any non-original code must be clearly identified using comments within the submitted code, and an accurate reference with an appropriate license must be included within the references section of the report.

2 Submission

The work must be submitted through the module area on StudentCentral. Submit the implementation and report as a single Zip file.

Any work submitted later than 15:00 on 10 Jan 2022 will be treated as late.

3 Deliverables

The deliverables consist of:

1. A functioning Java program meeting the requirements defined above.

2. Report.

Data structures assignment/week4b

CI583: Data Structures and Operating Systems

Algorithmic strategies to solve NP-hard problems

Depth-first and breadth-first

Many NP problems can be reduced to a problem in which the
nodes in a graph must be visited exactly once – this is a more
general idea of traversal, which we used with trees.

For instance, we may need to transmit a network packet to every
computer on a network, making sure that no computer receives it
twice.

There are two ways we might do this: depth-first and breadth-first.

A depth-first traversal will go as far as possible down a given path
before it considers any other. A breadth-first traversal goes evenly
in many directions.

[Figure: graph with nodes labelled 1 to 9]

Depth-first traversal

In a depth-first traversal, we visit the starting node then follow
edges through the graph until we reach a dead end. In an
undirected graph a node is a dead end if all nodes adjacent to it
have been visited. In a directed graph a node is also a dead end if it
has no outgoing edges.

[Figure: graph with nodes labelled 1 to 9]

When we reach a dead end we go back up the path until we find a
node with an unvisited adjacent node. One traversal of this graph
starting at 1 reaches a dead end at 6: [1, 2, 3, 4, 7, 5, 6].

We then go back to node 7 and find that 8 is unvisited. We visit 8
and reach a dead end.

We then go back to 4 and find that 9 is unvisited. The next time
we go back up the path we end up at the starting node and we are
done.
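
The walk described above can be sketched in Java. The slide's figure did not survive extraction, so the edge list below is an assumption, chosen so that the traversal visits nodes in the order quoted in the text; DepthFirstDemo, buildGraph and depthFirst are illustrative names. The sketch uses recursion, so the call stack remembers the path to go back up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DepthFirstDemo {
    // Assumed undirected edges; the original figure is not recoverable.
    static final int[][] EDGES = {
        {1, 2}, {1, 8}, {2, 3}, {3, 4}, {4, 7}, {4, 9}, {7, 5}, {7, 8}, {5, 6}
    };

    // Builds adjacency lists; neighbours are kept in the order the edges are listed.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> adjacency = new LinkedHashMap<>();
        for (int[] e : edges) {
            adjacency.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            adjacency.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }
        return adjacency;
    }

    // Returns the nodes in the order a depth-first traversal first visits them.
    static List<Integer> depthFirst(Map<Integer, List<Integer>> graph, int start) {
        Set<Integer> visited = new LinkedHashSet<>();
        visit(graph, start, visited);
        return new ArrayList<>(visited);
    }

    private static void visit(Map<Integer, List<Integer>> graph, int node, Set<Integer> visited) {
        if (!visited.add(node)) {
            return; // already visited
        }
        for (int next : graph.getOrDefault(node, List.of())) {
            visit(graph, next, visited);
        }
        // Falling out of the loop is a dead end: returning goes back up the path.
    }

    public static void main(String[] args) {
        System.out.println(depthFirst(buildGraph(EDGES), 1));
    }
}
```

With this assumed edge list the traversal from node 1 reaches the dead end at 6, returns to 7 to pick up 8, then returns to 4 to pick up 9.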


Breadth-first traversal

With a breadth-first traversal we first visit all nodes which are
adjacent to the starting node. Then we visit all nodes which are
two edges away from the starting node, and so on. For the graph
below, one traversal starting at 1 is [1, 2, 8, 3, 7, 4, 5, 9, 6].
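
A minimal Java sketch of this traversal, using a queue to hold nodes that have been discovered but not yet visited. The edge list is an assumption standing in for the lost figure, picked so that the output matches the order quoted above; BreadthFirstDemo and breadthFirst are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

final class BreadthFirstDemo {
    // Assumed undirected edges; the original figure is not recoverable.
    static final int[][] EDGES = {
        {1, 2}, {1, 8}, {2, 3}, {3, 4}, {4, 7}, {4, 9}, {7, 5}, {7, 8}, {5, 6}
    };

    // Returns the nodes in the order a breadth-first traversal visits them.
    static List<Integer> breadthFirst(int[][] edges, int start) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        for (int[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            graph.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }

        List<Integer> order = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();
        Queue<Integer> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);

        while (!queue.isEmpty()) {
            int node = queue.remove();
            order.add(node);
            for (int next : graph.getOrDefault(node, List.of())) {
                if (seen.add(next)) { // true only the first time a node is discovered
                    queue.add(next);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(EDGES, 1));
    }
}
```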


Traversal

Implementations of depth-first traversal often use a stack.
Implementations of breadth-first traversal often use a queue. Give
DF and BF traversals starting at the node labelled 1 for this
directed graph:

[Figure: directed graph]

Weighted graphs

Consider the problem of finding "cheapest" paths in a directed and
weighted graph:

[Figure: directed, weighted graph]

Minimum spanning trees

The problem of finding cheapest or shortest paths in a weighted
and connected graph reduces to that of finding the minimum
spanning tree (MST). A spanning tree for a graph, G, contains all
the nodes of G and a subset of the edges and has no cycles. A
connected graph is one in which there is a path from every node to
any other.

The MST for a graph, G, is a spanning tree in which the total of
the edge weights is minimal.

We can calculate the MST by brute force – for n edges this takes
2^n comparisons between potential MSTs – this is an NP problem.


Data structures assignment/week8a

Processes and threads

CI583: Data Structures and Operating Systems
OS Design Principles

Conceptually, the thread is a unit of computation, to be stopped
and started by the OS.

[Figure: a process containing Thread #1 and Thread #2, shown against a time axis]

Each thread belongs to an enclosing process, which can also be
stopped and started by the OS and which is used to store context
which includes:

an address space (the set of memory locations that contains
the code and data for the program),

a list of references to open files, and

other information that might be common to several threads.

To represent a process we use a data structure called a process
control block (PCB), containing references to all the info given
above and to a list of threads.

Each thread is represented by a thread control block (TCB). The
TCB contains references to thread-specific context, including the
thread’s stack and contents of registers.

[Figure: a PCB holding the address space description, open file descriptors, list of threads and current state; each thread in the list has a TCB holding a stack pointer, other registers and current state, with the stack pointers referring to the threads' stacks]

The OS needs to be able to create, delete and synchronise
processes.

In Windows and *NIX a process is either active or terminated.

A terminated process is deleted (by a process dedicated to this
purpose) when all of its threads are terminated and there are no
more references to the process from elsewhere.

When a process is created, the address space is loaded into
memory. On Linux, this is accomplished by the exec family of system calls.

How does the OS initialise the address space?

One approach would be to copy the code and data of the program
into the address space, but this would be inefficient.

The code section of the program is read-only and so can be shared
by any processes executing the same program.

The parts of the data section that are never modified can also be
shared.

A better approach is to map the executable file into the address
space.

Both the address space and the executable file are divided into
blocks of equal size called pages.

The text regions of all processes running this executable are set up
using hardware-address translation facilities, with each process
mapping to the same location.

The data regions of each process initially refer to a single copy of
the data portion of the executable, which has been copied into
memory.

[Figure: Process A ($ find / -amin +1) and Process B ($ find ./ -name Main.java), each with TEXT and DATA regions, alongside a memory map with entries A: TEXT, A: DATA, B: TEXT and B: DATA]

When a process modifies data for the first time, it is given a new,
private page containing a copy of the pristine page.

This is called a private mapping.

Modern systems also use the notion of a shared mapping: when
data is modified the original page is altered and all processes see
the change.

When a thread is created it is in the runnable state.

At some point it is switched into the running state by the
scheduler (which we will come back to in more detail).

A thread that is running may then be put into the waiting state,
for instance because it has initiated an I/O action and needs to
wait for the result before doing anything else.

The thread can put itself into the waiting state, or can be moved
into the state by the OS.

If the OS uses time-slicing, whereby threads are run for a
maximum period of time before relinquishing the processor, the
scheduler can move the thread into the waiting state.

A thread can move itself into the terminated state, or it can be
moved there by the OS.

Note that a thread must be in the running state at least once
before terminating.

Terminated threads are removed in various ways, such as by a
dedicated “reaper” process.
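
The thread states described above can be written down as a small state machine. This is a sketch under stated assumptions: ThreadState and SketchThread are illustrative names, the move from waiting back to runnable is assumed (the text does not spell it out), and termination is only allowed from the running state as a simplification of the rule that a thread must run at least once before terminating.

```java
import java.util.EnumSet;
import java.util.Set;

enum ThreadState {
    RUNNABLE, RUNNING, WAITING, TERMINATED;

    // States that may directly follow this one.
    Set<ThreadState> successors() {
        switch (this) {
            case RUNNABLE:
                return EnumSet.of(RUNNING);             // chosen by the scheduler
            case RUNNING:
                return EnumSet.of(WAITING, TERMINATED); // by the thread itself or by the OS
            case WAITING:
                return EnumSet.of(RUNNABLE);            // assumption: not stated in the text
            default:
                return EnumSet.noneOf(ThreadState.class);
        }
    }

    boolean canMoveTo(ThreadState next) {
        return successors().contains(next);
    }
}

final class SketchThread {
    // A newly created thread starts in the runnable state.
    private ThreadState state = ThreadState.RUNNABLE;

    ThreadState state() {
        return state;
    }

    void moveTo(ThreadState next) {
        if (!state.canMoveTo(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```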

[Figure: thread state diagram with the states Runnable, Running, Waiting and Terminated]

Data structures assignment/week9e

Journalled file systems

CI583: Data Structures and Operating Systems
Journalled File systems

In journalling, the steps of a transaction are written to a journal
before being committed, or written to disk.

If anything goes wrong whilst the changes are being written to
disk, a recovery procedure repeats the steps from the journal when
the system restarts.

If any change is made twice, that’s not a problem because changes
are idempotent – carrying them out n times is the same as carrying
them out once.
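
A minimal sketch of why idempotent steps make replay safe, assuming a toy "disk" that is just an array of integers and journal entries of the form "set block i to value v". RedoJournalSketch and its methods are hypothetical names for illustration, not part of any real file system.

```java
import java.util.ArrayList;
import java.util.List;

final class RedoJournalSketch {
    // One journalled step: set block `index` to `value`. Repeating it changes nothing.
    static final class Write {
        final int index;
        final int value;

        Write(int index, int value) {
            this.index = index;
            this.value = value;
        }
    }

    private final int[] disk;
    private final List<Write> journal = new ArrayList<>();

    RedoJournalSketch(int blocks) {
        disk = new int[blocks];
    }

    // Records the steps of a transaction in the journal before any block is touched.
    void log(List<Write> transaction) {
        journal.addAll(transaction);
    }

    // Applies the first `limit` journalled steps; a small limit simulates a crash part-way.
    void apply(int limit) {
        for (int i = 0; i < Math.min(limit, journal.size()); i++) {
            Write w = journal.get(i);
            disk[w.index] = w.value;
        }
    }

    // Recovery replays the whole journal, repeating any steps that were already done.
    void recover() {
        apply(journal.size());
    }

    int[] snapshot() {
        return disk.clone();
    }
}
```

Calling recover() after an interrupted apply, or calling it several times, leaves the disk in the same state as one uninterrupted run.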

There are two ways to do this:

redo journalling (as described above), and

undo journalling, in which the old contents of data blocks are
stored in the journal, allowing the user to get back to the
previous consistent state after a crash.

Making every change twice (once in a journal, once for real) risks
doubling the amount of work, and journalling would not have been
practical on the systems of the 70s.

However, aggressive caching and other techniques mean that we
can minimise the work involved by batching up sets of changes and
carrying them out in one go, rather than writing to the journal
every time one byte is altered.

How big, then, should a transaction be?

Deleting a large file might require millions of operations, so that a
single task is too big for the journal.

Carrying out each small change in its own transaction would be
inefficient.

In practice, small changes are batched together into one
transaction and big changes are broken up into several transactions.

Some systems (e.g. ext3) use time-based transactions.

The important thing is that we respect the ACID rules and take
the system from one consistent state to another.

Shadow-paged file systems

Journalled file systems ensure consistency but involve rather a lot
of work. There is a simpler solution: shadow-paged file systems.

Like journalling, shadow-paged systems are only viable in a context
where memory is plentiful and processors are fast.

Examples include ZFS from Sun.

The idea is simple – the whole file system is represented in memory
as a data structure called the shadow-page tree, the root of which
is called the überblock.

The tree contains pointers to all metadata and data blocks.

Whenever a disk block is about to be changed a copy is made and
it is that which is modified – the original is kept unchanged.

To link the modified copy into the tree, the parent node must be
altered, but instead of changing it directly, that node is also
copied, and so on, up to the root node.

This procedure is called copy-on-write.
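
The copying of a block and of every ancestor up to the root can be sketched with an immutable binary tree. This is only an illustration of the idea: ShadowPageSketch, Node and write are hypothetical names, leaves stand in for data blocks, and a real shadow-page tree has far wider nodes. The caller decides when to start using the returned root.

```java
final class ShadowPageSketch {
    // A node is either a leaf holding data or an interior node with two children.
    static final class Node {
        final String data; // non-null only for leaves
        final Node left;   // non-null only for interior nodes
        final Node right;  // non-null only for interior nodes

        Node(String data) {
            this.data = data;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.data = null;
            this.left = left;
            this.right = right;
        }
    }

    // Returns a new root in which the leaf reached by `path` (a string of 'L'/'R' steps)
    // holds `newData`. Every node on the path is copied; no existing node is modified,
    // and subtrees off the path are shared between the old and new trees.
    static Node write(Node node, String path, String newData) {
        if (path.isEmpty()) {
            return new Node(newData);
        }
        String rest = path.substring(1);
        return path.charAt(0) == 'L'
            ? new Node(write(node.left, rest, newData), node.right)
            : new Node(node.left, write(node.right, rest, newData));
    }
}
```

Because the old root still points at the unmodified nodes, holding on to it keeps a complete view of the tree as it was before the write.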

The root node is modified directly in a single disk write.

If the system crashes while an update is in progress but before the
root node has been altered, the system comes back up with the old
copy of the tree.

By keeping the original root node, we have a snapshot of the
system as a whole.

[Figure: shadow-page tree with the levels root, indirect inode blocks, direct inode blocks, indirect data blocks and direct data blocks]

Data structures assignment/week9d

Resiliency

CI583: Data Structures and Operating Systems
File systems: Resiliency

Early file systems such as S5FS performed badly in the face of
crashes and other unexpected shutdowns.

Not only could the files which were open at the time be corrupted,
but other completely unrelated files could also be damaged.

It is easy to understand how unwritten modifications to a file could
be lost, but how were other files being affected?

This happened because data structures which describe the whole
file system, such as the superblock and the I-list, could easily
become corrupted.

This is even more serious than losing the latest version of a user’s
file.

In the worst case, two separate inodes could point to the same
data, or system files could be marked as free space, etc!

The problem stems from the fact that many operations that we
think of as a “semantic unit”, such as the system call write,
actually comprise many operations.

Consider the task of adding an element to the back of a queue.

In combination with caching, an interruption here can lead to the
“last” element in the queue being some random memory location.

Corrupting a data structure

[Figure sequence: a data structure being corrupted step by step]

When a cache is being used, the user may actually have instructed
the OS to save their work and been given feedback that the task is
done, whilst in fact the changes are still in the cache and will not
survive a crash.

We require the metadata structures on disk (list of free space,
inodes, diskmaps within inodes, directories, etc) to be well formed
at all times even if, at a push, they might not contain the latest
updates.

If this is the case, then the data is consistent.

The measures taken by file systems to ensure consistency have
normally focused on metadata consistency, rather than the
contents of user files, although this is obviously important to the
user.

One approach is to ensure that every change to the system
(write, creat, rename, unlink) takes the system from one
consistent state to another.

This is called a consistency-preserving approach, as used by FFS
(80s UNIX file system, one of the successors to S5FS) for example.
This is difficult in practice as every system change involves many
individual changes.

A second approach is called the transactional approach.

Using this approach, we collect updates into groups called
transactions, inspired by the database world.

This group of updates can be treated as a single action with the
ACID properties.

Atomic: each transaction is all-or-nothing – either all of it
takes place or none of it does.

Consistent: the system is always consistent. None of the
inconsistent states that might exist while a transaction is
taking place are visible outside of the transaction.

Isolated: a transaction has no effect until it is committed to
disk, and is not affected by any other ongoing (uncommitted)
transactions.

Durable: once a transaction is committed, its effects persist.

The two main approaches to providing these properties are
journalling and shadow-paging, which we will look at in turn.


Data structures assignment/week4c

CI583: Data Structures and Operating Systems

Algorithmic strategies to solve NP-hard problems

Greedy algorithms

Greedy algorithms work by looking at a subset of a larger problem
and finding the best solution for that subset.

Dijkstra’s Shortest Path algorithm is a greedy algorithm developed
in the late 50s. At each step it looks at adjacent edges and decides
which one to add to the spanning tree. By doing this repeatedly,
we end up with a spanning tree that has the minimum overall total,
the MST.

Dijkstra’s shortest path

The algorithm works by putting all nodes into one of three
categories:

1 in the tree: nodes already added to the spanning tree,
2 on the fringe of the tree: those nodes adjacent to the current
  node, and
3 not yet considered.

The algorithm, informally:

1 Select a starting node.
2 Build the initial fringe from nodes adjacent to
  the starting node.
3 While there are nodes not yet considered, do
    1 Choose the edge in the fringe with the smallest
      weight.
    2 Add the associated node to the tree.
    3 Continue with the new selected node and updated
      fringe.
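
The informal steps above can be sketched in Java with a priority queue as the fringe. Choosing the lightest fringe edge at each step, as the steps are written, is what is usually called Prim's algorithm for minimum spanning trees; Dijkstra's shortest-path algorithm has the same shape but orders the fringe by total distance from the start. FringeSketch, Edge and spanningTreeWeight are illustrative names, nodes are assumed to be numbered from 0, and the graph is assumed to be undirected and connected.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

final class FringeSketch {
    static final class Edge {
        final int from;
        final int to;
        final int weight;

        Edge(int from, int to, int weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    // Grows a spanning tree from `start` and returns the total weight of its edges.
    static int spanningTreeWeight(int nodeCount, List<Edge> edges, int start) {
        List<List<Edge>> adjacent = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            adjacent.add(new ArrayList<>());
        }
        for (Edge e : edges) {
            adjacent.get(e.from).add(e);
            adjacent.get(e.to).add(new Edge(e.to, e.from, e.weight)); // undirected
        }

        Set<Integer> inTree = new HashSet<>();
        PriorityQueue<Edge> fringe =
            new PriorityQueue<>(Comparator.comparingInt((Edge e) -> e.weight));

        // Steps 1 and 2: select the start node and build the initial fringe.
        inTree.add(start);
        fringe.addAll(adjacent.get(start));

        int total = 0;
        // Step 3: repeat while some node has not been added to the tree.
        while (inTree.size() < nodeCount && !fringe.isEmpty()) {
            Edge cheapest = fringe.remove();        // 3.1 smallest weight in the fringe
            if (!inTree.add(cheapest.to)) {
                continue;                           // edge leads back into the tree
            }
            total += cheapest.weight;               // 3.2 node joins the tree
            fringe.addAll(adjacent.get(cheapest.to)); // 3.3 update the fringe
        }
        return total;
    }
}
```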

Backtracking

The class of problems related to finding a path through a maze or
avoiding a series of obstacles can be solved by backtracking
algorithms. This type of algorithm makes choices at each step and,
when it reaches a dead end, retraces its steps to a point at which
an alternative choice can be made.

Image © http://www.liyenchong.com

Backtracking techniques save the state of the problem each time a
choice is made. These techniques are not only useful for
path-finding.

Consider the N-queens problem. This problem asks how to position
N queens on an N × N chess board in such a way that they don’t
threaten each other.

4-queens

We can develop a state space tree to describe the 4-queens
problem. Note that no row, column or diagonal can contain more
than one queen. Each level of the state space tree represents the
possible places where each queen can be placed for one of the rows
of the board.

The state space tree will have 256 leaves, with each path from the
root to a leaf representing a possible solution. Most of these are
not solutions of course.

[Figure: top of the state space tree: root, with children 1,1 to 1,4, each of which has children 2,1 to 2,4]

Given the state space tree, we can carry out a depth-first traversal
to place 4 queens on the board, then check whether they can
attack each other:

[Figure: one path through the state space tree: root, 1,2, 2,4, 3,1, 4,3]

What is wrong with this approach?

Using this approach, much more work is done than necessary. As
soon as we place a queen so that it threatens another, a path that
passes through that node will not lead to a solution and we call
that a nonpromising node. So there is no need to populate or
search subtrees with a nonpromising root.

A backtracking recursive solution to n-queens would stop recursing
whenever it reaches a nonpromising node. Note that, in this case,
the algorithm does not literally need to retrace its steps.

Backtracking

The general strategy:

procedure SearchSpace(i)
    if there is a solution then
        output the solution
    else
        for every possible next step do
            if the step is promising then
                SearchSpace(i+1)
            end if
        end for
    end if
end procedure
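
As a sketch of how the general strategy might look for N-queens in Java: searchSpace mirrors the SearchSpace procedure, with the row number playing the role of i, and promising checks a candidate square against the queens already placed. QueensSketch and the method names are illustrative, and squares are numbered from 0 rather than 1.

```java
import java.util.ArrayList;
import java.util.List;

final class QueensSketch {
    // Returns every solution; solution[r] is the column of the queen in row r (0-based).
    static List<int[]> solve(int n) {
        List<int[]> solutions = new ArrayList<>();
        searchSpace(0, new int[n], solutions);
        return solutions;
    }

    private static void searchSpace(int row, int[] columns, List<int[]> solutions) {
        int n = columns.length;
        if (row == n) {
            solutions.add(columns.clone()); // all rows filled: a solution
            return;
        }
        for (int col = 0; col < n; col++) {
            if (promising(row, col, columns)) {
                columns[row] = col;
                searchSpace(row + 1, columns, solutions);
            }
            // A nonpromising square is skipped, so its subtree is never explored.
        }
    }

    // True if a queen at (row, col) is not attacked by the queens in rows 0..row-1.
    private static boolean promising(int row, int col, int[] columns) {
        for (int r = 0; r < row; r++) {
            int c = columns[r];
            if (c == col || Math.abs(c - col) == row - r) {
                return false; // same column or same diagonal
            }
        }
        return true;
    }
}
```

For n = 4 the first solution found places queens in columns 1, 3, 0, 2, which in 1-based numbering is the path 1,2, 2,4, 3,1, 4,3.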

Dynamic programming

Dynamic programming techniques include algorithms in which the
most efficient solutions depend on choices that might change with
time.

The key feature of dynamic programming is memoisation – this is
the technique of storing the results of expensive computations in
order to reuse them.

This Java method calculates the nth element in the Fibonacci
sequence (0, 1, 1, 2, 3, 5, 8 . . .):

int fibonacci(int n) {
    if (n < 2) return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

This is very inefficient. To calculate fibonacci(10) we calculate
fibonacci(9) and fibonacci(8). To calculate fibonacci(9) we
calculate fibonacci(8) (again) and fibonacci(7), and so on.

A solution that uses memoisation:

int fibonacci(int n) {
    if (n < 2) return n;
    // data[i] holds element i + 1 of the sequence
    int[] data = new int[n];
    data[0] = 1;
    data[1] = 1;
    // The source text breaks off inside this loop header; the loop body
    // and the return statement are an assumed completion.
    for (int i = 2; i < n; i++) {
        data[i] = data[i - 1] + data[i - 2];
    }
    return data[n - 1];
}

Efficient directories

Demo

Next time

Memory management: virtual memory, page tables, etc.

Data structures assignment/week8b

Managing hardware Shared libraries

CI583: Data Structures and Operating Systems
OS Design Principles

Outline

1 Managing hardware

2 Shared libraries

Managing hardware

In an earlier lecture we discussed device drivers, which contain the
code that knows how to interact with a given device.

How do we make sure that the right driver is associated with the
right device, that all devices are properly initialised when the
system starts, or that new devices are recognised correctly when
attached?

In early UNIX systems the kernel had hard-coded support for the
relevant devices. In order to add a new device, the user needed to
recompile the kernel image.

Each device was identified by a device number which was a pair of
the major device number, which identified the driver, and minor
device number, which identified the particular device among those
handled by this driver.

Special files were created on the file system in the /dev directory
to refer to devices.

These “special” files refer to the device using its device number.
Thus, when an application opens the “file” /dev/sda1 the kernel
uses the right device and driver.

This approach was laborious and inadequate for the number of
devices that can be used with today’s systems.

The modern approach uses what we might call meta-drivers, each
responsible for the class of devices that can use a particular bus.

So, the USB driver knows about USB devices and probes the
system at start-up time to see which are attached.

The correct drivers and kernel modules are then loaded to initialise
these devices.

Whilst a modern Linux system is running, the udev “daemon”
listens for the connection and removal of devices and updates the
contents of /dev accordingly.

Most of what udev does is in userland (doesn’t require special
privileges).

Interacting with a device

We begin by considering how the OS might interact with a very
simple device – a terminal, which takes user input at the command
line and displays the results in a simple text-only UI.

Although this might seem like
an obsolete example, the way
that the OS reads from and
writes to this device is
something that can be adapted
for many contexts.

We can see straight away that it’s not enough to read one
character at a time from the device, or to write data straight to
the device. Why not?

1 Data may be sent to the device faster than they can be
processed, or generated by the user faster than the application
can accept it.

2 Chars may arrive from the keyboard when there is no waiting
read request.

3 Input may need to be processed in some way before it reaches
the application.

For instance, chars need to be echoed to the screen so that
the user can see what they are typing.

Chars may be grouped into lines which can be edited before
submitting by pressing enter.

The terminal may support tab-completion.

Items 1 and 2 are examples of the producer-consumer problem, in
which two processes need to synchronise their access to a finite
queue or buffer.

The producer’s job is to generate a piece of data, put it into the
buffer and start again.

At the same time, the consumer is consuming the data (i.e.,
removing it from the buffer) one piece at a time.

The problem is to make sure that the producer won’t try to add
data into the buffer if it’s full and that the consumer won’t try to
remove data from an empty buffer.
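
The producer-consumer arrangement described above can be sketched in Java. This is only an illustration, not how a kernel implements terminal buffers: it assumes the standard library's ArrayBlockingQueue as the finite buffer, and the class and method names (BoundedBufferDemo, transfer) are made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedBufferDemo {
    /** Passes every character of input through a bounded buffer; returns what the consumer received. */
    static String transfer(String input, int capacity) {
        BlockingQueue<Character> buffer = new ArrayBlockingQueue<>(capacity);
        StringBuilder received = new StringBuilder();

        // Producer: generate a piece of data, put it into the buffer, start again.
        Thread producer = new Thread(() -> {
            try {
                for (char c : input.toCharArray()) {
                    buffer.put(c); // blocks while the buffer is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Consumer: remove data from the buffer one piece at a time.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < input.length(); i++) {
                    char c = buffer.take(); // blocks while the buffer is empty
                    received.append(c);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return received.toString();
    }

    public static void main(String[] args) {
        System.out.println(transfer("hello, terminal", 4));
    }
}
```

Because put and take block rather than fail, the producer never adds to a full buffer and the consumer never removes from an empty one, which is exactly the synchronisation the problem asks for.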

The OS will communicate with many devices using the same
model of input and output queues. This even applies to a window
manager, which controls the GUI: here the input comes from the
keyboard and mouse, and the window manager needs to keep track of
which application window has the focus, whilst the output is a
stream of requests to repaint parts of the display using new data.

We separate this common functionality into something called a line
discipline module.

Shared libraries

We have already mentioned the fact that each OS has an API and
that high-level languages provide convenient calls to the API in
standard libraries.

As well as wrappers for the API, standard libraries contain large
numbers of standard functions for convenience.

These shared libraries are called dynamic-link libraries (DLLs) on
Windows and shared objects on Linux.

One advantage of using them is that they need not be loaded until
needed, improving the start-up time of a program.

The biggest advantage is code reuse – few programmers would
attempt GUI programming without a library.

One disadvantage is that they bring increased complexity (DLL A
depends on DLL B, which depends on DLL C – are the versions of
these DLLs compatible?) – this complexity is greatly tamed by
managed runtime environments like .NET or the JVM.

When a program is converted into machine code that can be
executed directly by the OS, it needs to contain everything it needs
to run.

When we include a call to a shared function in our program, such
as a call to the C function printf (or, more indirectly,
System.out.println in Java), a compiler could take one of
several choices:

Approaches to shared libraries:

1 Include a copy of printf with our program. This would make
our programs much larger than they need to be and be a
waste of storage.

2 Load a copy of printf at runtime, along with our program.
There would only be one copy on disk but each program would
load it into memory: again, this would be very inefficient.

3 Ensure that all processes that make use of printf refer to
the same known location, L. If some process wants to make
use of that location for another purpose, move the single copy
of printf to another location, M. This is the approach taken
by Windows.

4 Ensure that each process that calls on printf maintains a
table mapping an arbitrary memory location, L1, to the real
location in memory of printf, L2. This approach is called
position-independent code and is the approach used by most
modern UNIX systems.

After the break

Storage, virtual memory, virtualisation, scheduling.

Data structures assignment/week8c

Scheduling

CI583: Data Structures and Operating Systems
Scheduling

We have said that the main purpose of an OS is to manage
resources, and one of the ways of doing this is to ensure that
processes and threads have access to a “fair” amount of processor
time.

Managing the access to processors is called scheduling. This is a
dynamic process – the scheduler has to respond immediately to
demands for processor time, as threads move in and out of the
runnable state.

At the highest level, we can consider that we have a list of
runnable threads which we want to sort so that the most urgent
ones are at the front of the list. But what do we mean by
“urgent”? That could be a question of:

giving priority to interactive threads,

giving priority to system-level threads,

maximising the number of threads processed per unit of time,

a combination of the above.

Scheduling strategies

The strategy we use will depend on the type of system we are
designing:

1 Simple batch systems (SBS): Each process runs to completion
and the job of the scheduler is to pick the next task to be run.

The two main considerations are system throughput and
average wait time.

2 Multiprogrammed batch systems: Same as SBS but several
jobs are run concurrently.

The considerations from the SBS are supplemented with the
questions of how many jobs should be running at any one
time and how the processor time is apportioned amongst
running jobs.

3 Time-sharing systems: here the main question becomes
apportioning time to the jobs that are ready to execute – the
runnable threads.

The main concern is response time – the time from when a
command is given to when it is completed. Short requests
should be completed quickly.

4 Shared servers: A single computer being used by many clients.

The question arises here of giving each client a reasonable
apportionment of time, instead of focusing only on runnable
threads.

5 Real-time systems: Can be soft or hard.

An example of soft real-time would be video processing – data
must be processed in a strictly synchronised fashion and as
efficiently as possible, but some lag is not a disaster.

An example of hard real-time would be autonomous vehicle
software that alters the direction of a car – certain commands
must be handled in a timely way.

Simple Batch Systems

On an SBS, jobs are run one at a time and, when a job ends, we
can choose between several queued jobs to decide which to run
next.

There are two approaches we could take:

1 First-in-first-out (FIFO): jobs are executed in the order they
were submitted to the system.

2 Shortest-job-first (SJF): some (relatively crude) measure is
used to decide how long a job is going to take, and the
shortest will be executed first.

FIFO might seem fairest, but if we take average throughput as the
measure of effectiveness for our scheduler, then SJF will perform
best.

However, without further measures, SJF could mean that a very
long job is queued indefinitely, so we need to make sure that
doesn’t happen.
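
To make the comparison concrete, here is a small sketch that computes the average wait time (one of the two considerations named earlier) for the same jobs under FIFO and SJF. It assumes all jobs are submitted at time zero and that their run times are known exactly, which a real scheduler can only estimate; the class and method names are illustrative.

```java
import java.util.Arrays;

class BatchScheduling {
    /** Average time a job spends queued when jobs run back to back in the given order. */
    static double averageWait(int[] runTimes) {
        long elapsed = 0;
        long totalWait = 0;
        for (int t : runTimes) {
            totalWait += elapsed; // this job waited for everything that ran before it
            elapsed += t;
        }
        return (double) totalWait / runTimes.length;
    }

    /** FIFO: run the jobs in submission order. */
    static double fifoAverageWait(int[] submitted) {
        return averageWait(submitted);
    }

    /** SJF: run the shortest job first (sorts a copy, leaving the input untouched). */
    static double sjfAverageWait(int[] submitted) {
        int[] sorted = submitted.clone();
        Arrays.sort(sorted);
        return averageWait(sorted);
    }

    public static void main(String[] args) {
        int[] jobs = {10, 2, 4, 4};
        System.out.println("FIFO: " + fifoAverageWait(jobs)); // waits 0, 10, 12, 16
        System.out.println("SJF:  " + sjfAverageWait(jobs));  // waits 0, 2, 6, 10
    }
}
```

For these four jobs the average wait drops from 9.5 to 4.5 time units under SJF, but the 10-unit job now runs last, and it would keep being pushed back if shorter jobs continued to arrive.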

Multiprogrammed Batch Systems

An MBS is essentially the same as SBS except that several jobs are
held in memory at one time and time-slicing is used to switch
focus between the currently active jobs.

To do this we need to decide how many jobs to hold in memory at
any one time and to decide on a time quantum – the amount of
time to allocate to a job before it is preempted in favour of the
next job.

A solution which gets round some of the problems with SJF is to
hold multiple queues of relatively short and relatively long jobs, so
a short and a long job will be held in memory at any one time.

Time-Sharing Systems

A TSS is one which has several users logged on at any one time.

The main concern of the scheduler is that the system should feel
responsive. A job which ought to be short should be short.

If a job that normally takes 3 or 4 minutes, such as compiling some
code, takes 5 minutes, users probably won’t mind.

But if a job that should be very quick, such as opening a file in a
word processor, takes 1 minute then the users will think the system
is slow.

A scheduling strategy for a TSS should favour short and interactive
operations at the possible expense of longer ones.

As a first attempt, consider a time-sliced scheduler that uses a
round robin strategy. When a thread uses up its time quantum, it
is moved to the back of the queue and the next thread gets to run.
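
A round robin scheduler of this kind can be simulated in a few lines. The sketch below is a simplification: threads are just indices into an array of work amounts (an assumption of the example), and nothing ever blocks for IO.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RoundRobin {
    /** Returns thread ids (array indices) in the order in which they finish. */
    static List<Integer> finishOrder(int[] workNeeded, int quantum) {
        int[] remaining = workNeeded.clone();
        Deque<Integer> runQueue = new ArrayDeque<>();
        for (int id = 0; id < remaining.length; id++) {
            runQueue.addLast(id);
        }

        List<Integer> finished = new ArrayList<>();
        while (!runQueue.isEmpty()) {
            int id = runQueue.pollFirst();                     // next thread gets to run
            remaining[id] -= Math.min(quantum, remaining[id]); // run for at most one quantum
            if (remaining[id] == 0) {
                finished.add(id);
            } else {
                runQueue.addLast(id);                          // quantum used up: back of the queue
            }
        }
        return finished;
    }

    public static void main(String[] args) {
        System.out.println(finishOrder(new int[]{5, 2, 9}, 3)); // prints [1, 0, 2]
    }
}
```

The short thread (2 units) finishes first even though it was queued second, which is the responsiveness a time-sharing system is after.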

As we want to prioritise short, interactive operations, we can
modify this strategy to assign a priority to each thread. We can
use this notion of high/low priority to move short, interactive
threads to the top of the list or to maintain multiple queues.

[Figure: separate queues labelled SHORT, SHORT/INTERACTIVE and LONG.]

Determining the level of interactivity of a process will necessarily
involve some guesswork.

A more effective algorithm is to reduce the priority of a thread
each time it uses a time quantum.

The first time a thread runs, it does so at the highest priority level
and, presuming it isn’t completed, its priority is reduced.

The next time it runs, its priority is reduced again, and so on.
This is called a multilevel feedback queue.

In this system, threads in a low-priority queue can only run if there
are no threads in a higher-priority queue.
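
The following is a minimal sketch of a multilevel feedback queue, under the assumptions that every thread needs a known amount of work, that there is a fixed number of levels, and that level 0 is the highest priority. The names FeedbackQueue, admit and step are invented for the example and do not correspond to any real scheduler's API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FeedbackQueue {
    private final List<Deque<Integer>> queues = new ArrayList<>();  // index 0 = highest priority
    private final Map<Integer, Integer> remaining = new HashMap<>(); // thread id -> work left
    private final int quantum;

    FeedbackQueue(int levels, int quantum) {
        this.quantum = quantum;
        for (int i = 0; i < levels; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    /** A new thread starts at the highest priority level. */
    void admit(int id, int workNeeded) {
        remaining.put(id, workNeeded);
        queues.get(0).addLast(id);
    }

    /** Runs one quantum; returns the id of the thread that ran, or -1 if nothing is runnable. */
    int step() {
        for (int level = 0; level < queues.size(); level++) {
            Integer id = queues.get(level).pollFirst();
            if (id == null) {
                continue; // this level is empty, so look at the next one down
            }
            int left = remaining.get(id) - quantum;
            if (left > 0) {
                // Not finished: reduce its priority (no lower than the bottom level).
                remaining.put(id, left);
                int demoted = Math.min(level + 1, queues.size() - 1);
                queues.get(demoted).addLast(id);
            } else {
                remaining.remove(id);
            }
            return id;
        }
        return -1;
    }
}
```

For example, with three levels and a quantum of 2, admitting thread 1 (6 units), calling step(), then admitting thread 2 (1 unit) means the next step() runs thread 2: the newcomer sits in the top queue while thread 1 has already been demoted.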

[Figure: a multilevel feedback queue, with queues ranging from HIGH to LOW priority.]

However, we can improve on this by taking other factors than the
number of time quanta consumed into account.

When we run a thread, we can modify its priority based on how
long it has been waiting since it last ran (e.g. it may have been in
the paused state for a long time, waiting for an IO operation to
end).

We can sum this up by saying a thread’s priority gets worse while it
is running, and better while it is waiting.

This is the strategy used by the schedulers in Windows and Linux
today.

At the same time, each OS allows the user some way of stating
how important they consider a process and its threads to be (on
Linux, this is the nice command).

Data structures assignment/week5d

CI583: Data Structures and Operating Systems

Compression algorithms

Outline

1 Compression

2 Run-length encoding

3 Huffman coding

Data, data everywhere!

In the digital age, we’re drowning in data. Google pioneered the
“Big Data” era by never throwing anything away and storing
loosely connected or unconnected data with the aim of applying
meaning to it later on.

Almost all data generated is metadata produced automatically.

The global data volume is estimated to have exceeded 64
zettabytes in 2020.

It’s hard to imagine how much data is contained in 64 zettabytes…

1 zettabyte = 1000 exabytes = 1 billion terabytes = 1 trillion GB

It has been estimated that all human words ever spoken could be
encoded into 5 exabytes [1].

[1] New York Times: http://tx0.org/5ls

Compression algorithms

Storage might be relatively cheap but the fact that we can store so
much data and transfer it across networks etc. is largely thanks to
effective and widely used compression algorithms which retain, to a
greater or lesser degree, the meaning of the input.

The field that studies this is called Information Theory, combining
mathematics and computer science, and founded by Claude
Shannon with a series of landmark works in the 1940s and 50s.

A compression algorithm can be lossless (information-preserving) or
lossy (some information considered non-essential is discarded).

Run-length encoding

Perhaps the simplest example of a lossless encoding is a run-length
encoding (RLE). This works by replacing repetition with a single
value and a count. Thus, if we have a protocol that encodes the
pixel values of one row of a black-and-white image like so:

WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWW

WWBWWWWWWWWWWWWWW

this has the following RLE:

12W1B12W3B24W1B14W
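
A run-length encoder for this count-then-symbol format is short enough to sketch in full. It assumes the input contains no digit characters (otherwise the output would be ambiguous); the class name RunLength is illustrative.

```java
class RunLength {
    /** Replaces each run of a repeated character with its length followed by the character. */
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char symbol = input.charAt(i);
            int runEnd = i;
            // Advance to the first position that holds a different character.
            while (runEnd < input.length() && input.charAt(runEnd) == symbol) {
                runEnd++;
            }
            out.append(runEnd - i).append(symbol);
            i = runEnd;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Build the row of pixels run by run (String.repeat needs Java 11 or later).
        String row = "W".repeat(12) + "B" + "W".repeat(12) + "BBB"
                   + "W".repeat(24) + "B" + "W".repeat(14);
        System.out.println(encode(row)); // prints 12W1B12W3B24W1B14W
    }
}
```

Here 67 characters shrink to 18. On input with few repeats the same scheme makes things worse, since "abc" becomes "1a1b1c".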

Optimising RLE

RLE algorithms are simple and can be extremely effective, especially
for data with little variation, where savings of up to 90% are not
unusual.

To store data with more variation in it, such as text, we can
optimise the process by transforming the input before applying the
RLE. Obviously, such a transformation needs to be reversible so
that we can recover the original.

Burrows-Wheeler Transform

The Burrows-Wheeler Transform (BWT) transforms a sequence of
characters so that the most commonly repeating characters are
adjacent to each other, making the application of RLE more
effective. That would be easy enough in itself (just sort the input)
but, marvellously enough, BWT is also reversible.

This transformation is used in the bzip2 compression format, widely
used on UNIX systems.

Take some input, e.g. the word billing. To optimise this for RLE
we want multiple occurrences of the same character to be next to
each other. Put special characters at the start and end of the
input, e.g. ^billing$.

Form the set of rotations of the input. We rotate a word by moving
one character from the beginning to the end.

^billing$

billing$^

illing$^b

lling$^bi

ling$^bil

ing$^bill

ng$^billi

g$^billin

$^billing
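
The rotations listed above can be generated with a short loop. This sketch covers only the rotation step, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class Rotations {
    /** Returns every rotation of s, each formed by moving one more leading character to the end. */
    static List<String> rotations(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            // Characters from position i onwards, followed by the first i characters.
            result.add(s.substring(i) + s.substring(0, i));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String r : rotations("^billing$")) {
            System.out.println(r);
        }
    }
}
```

Run on ^billing$, this prints the nine rows in the same order as the table.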

Then sort this table in lexicographic order, where, for any letter x,
x < ^ < $.

billing$^
g$^billin
illing$^b
ing$^bill
ling$^bil
lling$^bi
ng$^billi
^billing$
$^billing

Then the last column of the table is the BWT of the input:
^nbllii$g.

To take the inverse BWT, i.e. to recover the original, we can insert
the BWT as one column of a table, then sort the rows of the table,
add another column, sort, and so on. When we have added all columns
and sorted for the last time, the row which ends with $ is the
original input. Try it yourself with a short string.

As well as being used with “regular” text files, BWT is often
applied to large amounts of data in bioinformatics, e.g. sequences
of millions of characters or bits representing DNA.

Huffman coding

In textual data, each character typically takes 8 (ASCII) or 16
(UTF-16) bits. Clearly, we should be able to save some space just by
encoding text in a more compact binary format.

Huffman coding is an elegant compression algorithm that combines
ideas from RLE (i.e. it exploits frequency information) with binary
encoding to save space. Huffman coding was invented by David Huffman
in 1952.

The idea is to generate a binary sequence that represents each
character required. This might be the English alphabet, some subset
of that, or any collection of symbols.

Say we have a 100KB file made up of repetitions of the letters a to
f. We start by creating a frequency table:

                        a    b    c    d    e    f
Frequency (thousands)   45   13   12   16   9    5

If we use a fixed-length code of 3 bits per character we can encode
this data in about 37.5KB. If we use a variable-length code and
assign the shortest code to the most frequently used characters, we
can encode it in just 28KB.

                        a    b    c    d    e     f
Frequency (thousands)   45   13   12   16   9     5
Code (fixed-length)     000  001  010  011  100   101
Code (variable-length)  0    101  100  111  1101  1100

To send this data over the wire or store it in a file, we need to
send/store the mapping from code to characters followed by the
concatenation of all binary codes as they appear in the unencoded
document.

Any sequence of codes must be unambiguous: we can only decode this
at the other end if no code is a prefix of any other. If we had used
1 for c and 11 for d, how would we decode 111?

Huffman trees

We can create these variable-length codes using a binary tree (not a
search tree). In a Huffman Tree the leaves contain the data, a
character and its frequency. Internal nodes are labelled with the
combined frequencies of their children.

[Figure: the Huffman tree for the table above. The root (100) has
a: 45 on its 0 branch and a node labelled 55 on its 1 branch; 55 has
children 25 (0) and 30 (1); 25 has children c: 12 (0) and b: 13 (1);
30 has children 14 (0) and d: 16 (1); 14 has children f: 5 (0) and
e: 9 (1).]

To decode data we start at the root and go left for 0 and right for
1 until we get to a leaf. So, to decode 0101100, we start at the
root, read an a, start at the root again, and so on to read abc.

An optimal Huffman tree is full.

Creating the Huffman coding

The most elegant part of this scheme is the algorithm used to create
the tree:

1 Make a tree node object for each character, with an extra label
  for its frequency.
2 Put these nodes in a priority queue, where the lowest frequency
  has highest priority.
3 Repeatedly:
    1 Remove two nodes from the queue and insert them as children of
      a new node. The char label of the new node is blank and the
      frequency label is the sum of the labels of its children.
    2 Put the new node back in the queue.
    3 When there is only one item in the queue, that's the Huffman
      tree.

[Figure sequence: building the tree. f: 5 and e: 9 are merged into
14; c: 12 and b: 13 into 25; 14 and d: 16 into 30; 25 and 30 into
55; finally a: 45 and 55 into the root, 100.]

Transmitting the Huffman coding

To use a Huffman coding as part of a compressed file format we begin
the file with a “magic number” identifying the format used and the
length of the alphabet, A, used by the file. We then store the table
as an array in the next |A| bits, followed by one code after another
until the EOF character.

Huffman coding is used in some of the most widely used compressed
formats, such as ZIP (as part of DEFLATE) and JPEG.

Data structures assignment/MemoryManagementSimulator

CI283 Operating Systems

Memory Management Simulator

Purpose

In this exercise, you will use a Memory Management Simulator. This
guide explains how to use the simulator and describes the display
and the various input files used and the output files produced by
the simulator.

The memory management simulator illustrates page fault behaviour in
a paged virtual memory system. The program reads the initial state
of the page table and a sequence of virtual memory instructions and
writes a trace log indicating the effect of each instruction.
It includes a graphical user interface so that you can observe page
replacement algorithms at work. Download memory.zip from
Studentcentral and unzip it.

Running the Simulator

The program reads a command file, optionally reads a configuration
file, displays a GUI window which allows you to execute the command
file, and optionally writes a trace file.

To run the program, first compile all Java files and then enter the
following command line.

$ java MemoryManagement commands memory.conf

The program will display a window allowing you to run the simulator.
You will notice a row of command buttons across the top, two columns
of "page" buttons on the left, and an informational display on the
right.

Typically, you will use the step button to execute a command from
the input file, examine information about any pages by clicking on a
page button, and when you're done, quit the simulation using the
exit button.

The buttons:

Button   Description
run      runs the simulation to completion. Note that the simulation
         pauses and updates the screen between each step.
step     runs a single step of the simulation and updates the display.
reset    initializes the simulator and starts from the beginning of
         the command file.
exit     exits the simulation.
page n   displays information about this virtual page in the display
         area at the right.

The informational display:

Field           Description
status:         RUN, STEP, or STOP. This indicates whether the current
                run or step is completed.
time:           number of "ns" since the start of the simulation.
instruction:    READ or WRITE. The operation last performed.
address:        the virtual memory address of the operation last
                performed.
page fault:     whether the last operation caused a page fault to
                occur.
virtual page:   the number of the virtual page being displayed in the
                fields below. This is the last virtual page accessed
                by the simulator, or the last page n button pressed.
physical page:  the physical page for this virtual page, if any. -1
                indicates that no physical page is associated with
                this virtual page.
R:              whether this page has been read. (1=yes, 0=no)
M:              whether this page has been modified. (1=yes, 0=no)
inMemTime:      number of ns ago the physical page was allocated to
                this virtual page.
lastTouchTime:  number of ns ago the physical page was last modified.
low:            low virtual memory address of the virtual page.
high:           high virtual memory address of the virtual page.

The Command File

The command file for the simulator specifies a sequence of memory
instructions to be performed. Each instruction is either a memory
READ or WRITE operation, and includes a virtual memory address to be
read or written. Depending on whether the virtual page for the
address is present in physical memory, the operation will succeed,
or, if not, a page fault will occur.

Operations on Virtual Memory

There are two operations one can carry out on pages in memory: READ
and WRITE.

The format for each command is

operation address

or

operation random

where operation is READ or WRITE, and address is the numeric virtual
memory address, optionally preceded by one of the radix keywords
bin, oct, or hex. If no radix is supplied, the number is assumed to
be decimal. The keyword random will generate a random virtual memory
address (for those who want to experiment quickly) rather than
having to type an address.

For example, the sequence

READ bin 01010101
WRITE bin 10101010
READ random
WRITE random

causes the virtual memory manager to:

1. read from virtual memory address 85
2. write to virtual memory address 170
3. read from some random virtual memory address
4. write to some random virtual memory address

Sample Command File

The "commands" input file looks like this:

// Enter READ/WRITE commands into this file
// READ
// WRITE
READ bin 100
READ 19
WRITE hex CC32
READ bin 100000000000000
READ bin 100000000000000
WRITE bin 110000000000001
WRITE random

The Configuration File

The configuration file memory.conf is used to specify the initial
content of the virtual memory map (which pages of virtual memory are
mapped to which pages in physical memory) and provide other
configuration information, such as whether operations should be
logged to a file.

Setting Up the Virtual Memory Map

The memset command is used to initialize each entry in the virtual
page map. memset is followed by six integer values:

1. The virtual page # to initialize
2. The physical page # associated with this virtual page (-1 if no
   page assigned)
3. If the page has been read from (R) (0=no, 1=yes)
4. If the page has been modified (M) (0=no, 1=yes)
5. The amount of time the page has been in memory (in ns)
6. The last time the page has been modified (in ns)

The first two parameters define the mapping between the virtual page
and a physical page, if any. The last four parameters are values
that might be used by a page replacement algorithm.

For example,

memset 34 23 0 0 0 0

specifies that virtual page 34 maps to physical page 23, and that
the page has not been read or modified.

Note:

• Each physical page should be mapped to exactly one virtual page.
• The number of virtual pages is fixed at 64 (0..63).
• The number of physical pages cannot exceed 64 (0..63).
• If a virtual page is not specified by any memset command, it is
  assumed that the page is not mapped.

Other Configuration File Options

There are a number of other options which can be specified in the
configuration file. These are summarized below.

enable_logging (values: true or false)
    Whether logging of the operations should be enabled. If logging
    is enabled, then the program writes a one-line message for each
    READ or WRITE operation. By default, no logging is enabled. See
    also the log_file option.

log_file (value: trace-file-name)
    The name of the file to which log messages should be written. If
    no filename is given, then log messages are written to stdout.
    This option has no effect if enable_logging is false or not
    specified.

pagesize (values: n, or power p)
    The size of the page in bytes as a power of two. This can be
    given as a decimal number which is a power of two (1, 2, 4, 8,
    etc.) or as a power of two using the power keyword. The maximum
    page size is 67108864 or power 26. The default page size is
    power 26.

addressradix (value: n)
    The radix in which numerical values are displayed. The default
    radix is 2 (binary). You may prefer radix 8 (octal), 10
    (decimal), or 16 (hexadecimal).

Sample Configuration File

The "memory.conf" configuration file looks like this:

// memset virt page # physical page # R (read from) M (modified) inMemTime (ns) lastTouchTime (ns)
memset 0 0 0 0 0 0
memset 1 1 0 0 0 0
memset 2 2 0 0 0 0
memset 3 3 0 0 0 0
memset 4 4 0 0 0 0
memset 5 5 0 0 0 0
memset 6 6 0 0 0 0
memset 7 7 0 0 0 0
memset 8 8 0 0 0 0
memset 9 9 0 0 0 0
memset 10 10 0 0 0 0
memset 11 11 0 0 0 0
memset 12 12 0 0 0 0
memset 13 13 0 0 0 0
memset 14 14 0 0 0 0
memset 15 15 0 0 0 0
memset 16 16 0 0 0 0
memset 17 17 0 0 0 0
memset 18 18 0 0 0 0
memset 19 19 0 0 0 0
memset 20 20 0 0 0 0
memset 21 21 0 0 0 0
memset 22 22 0 0 0 0
memset 23 23 0 0 0 0
memset 24 24 0 0 0 0
memset 25 25 0 0 0 0
memset 26 26 0 0 0 0
memset 27 27 0 0 0 0
memset 28 28 0 0 0 0
memset 29 29 0 0 0 0
memset 30 30 0 0 0 0
memset 31 31 0 0 0 0

// enable_logging 'true' or 'false'
// When true specify a log_file or leave blank for stdout
enable_logging true

// log_file
// Where is the name of the file you want output
// to be print to.
log_file tracefile

// page size, defaults to 2^14 and cannot be greater than 2^26
// pagesize or <'power' num (base 2)>
pagesize 16384

// addressradix sets the radix in which numerical values are displayed
// 2 is the default value
// addressradix
addressradix 16

// numpages sets the number of pages (physical and virtual)
// 64 is the default value
// numpages must be at least 2 and no more than 64
// numpages
numpages 64

The Output File
The output file contains a log of the operations since the simulation started (or since
the last reset). It lists the command that was attempted and what happened as a
result. You can review this file after executing the simulation.

The output file contains one line per operation executed. The format of each line is:

command address … status

where:

• command is READ or WRITE,
• address is a number corresponding to a virtual memory address, and
• status is okay or page fault.

Sample Output

The output “tracefile” looks something like this:
READ 4 … okay
READ 13 … okay
WRITE 3acc32 … okay
READ 10000000 … okay
READ 10000000 … okay
WRITE c0001000 … page fault
WRITE 2aeea2ef … okay

Suggested Exercises

1. Create a command file that maps any 8 pages of physical memory to the first
8 pages of virtual memory, and then reads from one virtual memory address
on each of the 64 virtual pages. Step through the simulator one operation at a
time and see if you can predict which virtual memory addresses cause page
faults. What page replacement algorithm is being used?

Data structures assignment/week4d

CI583: Data Structures and Operating Systems

Recursion, and some recursive problems

Outline

1 Recursion

2 The Towers of Hanoi

Recursion

Recursion occurs when an algorithm (or method, or function), A,
requires us to execute A again, repeatedly.

So that the execution of A does not diverge (run forever) we need
to define conditions under which the recursion will end.

We call this the base case: the conditions under which the
recursion continues are called the recursive case.

Recursion is closely linked to mathematical induction.

Any iterative algorithm can be expressed via recursion, and vice
versa.

There are (Turing complete) programming languages which don’t
include any construct for looping, such as Haskell.

Sometimes an iterative solution is the clearest, but recursive code is
often more concise and expresses the solution to the problem
elegantly and directly.

Recursion and iteration are more general constructs than the
algorithmic strategies we considered in the last lecture (greedy
algorithms, dynamic programming, etc).

We have seen recursion in OO style, when traversing tree structures:

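// Assumed base class (not shown on the slide): every tree can be traversed and visited.
abstract class Tree {
    abstract void traverse();
    void visit() { System.out.println(this); }
}

class Node extends Tree {
    Tree left, right; // assumed fields holding the two subtrees

    // Recursive case: traverse the left subtree, visit this node, then the right subtree.
    void traverse() {
        left.traverse();
        this.visit();
        right.traverse();
    }
}

class Leaf extends Tree {
    // Base case: a leaf has no subtrees, so the recursion stops here.
    void traverse() {
        this.visit();
    }
}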

We can also use it directly within a single method. A simple (naive,
in fact) way to compute the nth triangular number (the nth
triangular number is found by adding n to the (n − 1)th, so the
sequence runs [1, 3, 6, 10 …]):

int triangle(int n) {
    if (n == 1) return 1;               // base case
    else return n + triangle(n - 1);    // recursive case
}

The recursive call should normally be made with smaller input, so
that we know the algorithm will terminate. In some recursive
algorithms the input might grow before it shrinks but, obviously,
shrink it must.

The reason triangle can be considered naive is that it will use
much more memory at runtime than an iterative solution. In fact, it
could cause stack overflow errors for even fairly small values of n.
Look at what happens when we run it:

triangle(5)
if (5==1) 1 else 5 + triangle(5-1)
if (false) 1 else 5 + triangle(5-1)
5 + triangle(4)
5 + (if (4==1) 1 else 4 + triangle(4-1))
5 + (4 + (if (3==1) 1 else 3 + triangle(3-1)))
5 + (4 + (3 + (if (2==1) 1 else 2 + triangle(2-1))))
5 + (4 + (3 + (2 + (if (1==1) 1 else 1 + triangle(1-1)))))
5 + (4 + (3 + (2 + 1)))
15

The solution to this is to make triangle tail-recursive, meaning
that the recursive call is the last thing the method does.

In this way, we don’t need to keep intermediate values hanging
around.

Fortunately, every non-tail-recursive algorithm can be rewritten to a
tail-recursive version.

Unfortunately, the resulting code is often much less intuitive.
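
As a sketch of what this looks like for triangle, the version below threads an accumulator through the calls so that nothing remains to be done after the recursive call returns. The helper's acc parameter and the names used are assumptions of this example. One caveat for Java specifically: the standard JVM does not eliminate tail calls, so the tail-recursive form still uses one stack frame per call there, and the iterative version shown alongside it is what actually saves stack space.

```java
class Triangle {
    /** nth triangular number via a tail-recursive helper. */
    static long triangle(int n) {
        return triangle(n, 0);
    }

    // acc carries the running total, so the recursive call is the last thing the method does.
    private static long triangle(int n, long acc) {
        if (n == 0) return acc;          // base case: the accumulator holds the answer
        return triangle(n - 1, acc + n); // tail call: no pending addition is left behind
    }

    /** The loop that a tail-call-optimising compiler would effectively produce. */
    static long triangleIterative(int n) {
        long acc = 0;
        for (; n > 0; n--) {
            acc += n;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(triangle(5));               // prints 15
        System.out.println(triangleIterative(100000)); // prints 5000050000
    }
}
```

Unlike the earlier trace, a trace of triangle(5, 0) never grows: triangle(4, 5), triangle(3, 9), triangle(2, 12), triangle(1, 14), triangle(0, 15).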

In terms of the algorithmic strategies from the last lecture,
recursion is particular important in divide-and-conquer strategies.

These are strategies in which we divide the problem into two parts
to solved separately (or discard one of them).

An example would be Binary Search, though the algorithm we gave
in week 1 was iterative.

We will look at two e�cient sorting algorithms that use a
divide-and-conquer approach, MergeSort and QuickSort.
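
To make the "discard one of them" case concrete, here is a minimal recursive sketch of Binary Search. It is not the iterative week 1 version; the class and method names are assumptions made for this example, and the array is assumed to be sorted in ascending order.

```java
class RecursiveBinarySearch {
    // Returns the index of key in the sorted array a, or -1 if it is absent.
    static int search(int[] a, int key) {
        return search(a, key, 0, a.length - 1);
    }

    // Searches only the range a[lo..hi]; each call halves that range.
    private static int search(int[] a, int key, int lo, int hi) {
        if (lo > hi) return -1;              // empty range: not found
        int mid = lo + (hi - lo) / 2;        // midpoint without overflowing lo + hi
        if (a[mid] == key) return mid;
        if (a[mid] < key) {
            return search(a, key, mid + 1, hi);  // discard the left half
        }
        return search(a, key, lo, mid - 1);      // discard the right half
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 6, 10, 15, 21};
        System.out.println(search(data, 10)); // 3
        System.out.println(search(data, 7));  // -1
    }
}
```

Both recursive calls are in tail position, which is why this algorithm converts to a loop so easily.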

The Towers of Hanoi

The Towers of Hanoi is a classic puzzle. Move the discs from the
first peg to the last. You may only move one disc at a time, and you
can never put a disc on top of a smaller one. Solving it manually, a
pattern soon emerges (see the applet on studentcentral).

We will call the set of all discs, in their right order, the stack.
(Nothing to do with the data structure.)

[Figure: the three towers, labelled A, B and C.]

If you solve the problem manually you will find that intermediate,
smaller substacks form during the process of solving the puzzle.
The only way to transfer disc 4 to tower C is to move everything
on top of it out of the way!

[Figure: towers A, B and C, with discs 3 and 4 labelled.]

A recursive solution to move n discs from a source tower, S, to a
destination tower, D, via an intermediate tower, I:

1 Move the substack of n − 1 discs from S to I.
2 Move the largest disc from S to D.

3 Move the substack from I to D.

(Recursive definitions feel like cheating sometimes!) When we
begin, n = 4, S = A, I = B and D = C.

[Figures: the towers A, B and C, with the discs shown in different positions.]

But we can’t move the substack of discs [1,2,3] all at once. Still,
it’s easier than moving 4 discs…

In fact, we can move the three-disc substack from A to B in the
same way as moving the entire stack but for different values of S, I
and D: move substack [1,2] to intermediate tower C, then 3 to
destination tower B, then substack [1,2] from C to B.

The problem is getting smaller…

How do we move the substack [1,2] from A to C?

This one is quite easy: move 1 to B, then 2 to C, then 1 to C.

This is the base case: note that we have said exactly what we will
do without hand-waving like "move the substack…".

A recursive solution in Java:

class TowersApp {
    static int nDiscs = 3;

    public static void main(String[] args) {
        doTowers(nDiscs, 'A', 'B', 'C');
    }

    // Move n discs from tower 'from' to tower 'to', using 'inter' as the spare.
    public static void doTowers(int n, char from, char inter, char to) {
        if (n == 1) {
            System.out.printf("Disc 1 from %c to %c\n", from, to);
        } else {
            doTowers(n - 1, from, to, inter); // substack from -> inter (inter and to swap roles)
            System.out.printf("Disc %d from %c to %c\n", n, from, to);
            doTowers(n - 1, inter, from, to); // substack inter -> to (from and inter swap roles)
        }
    }
}

It seems astonishing at first that a problem that seems like it should
be quite complicated can be solved with so little code!

This is certainly a case of the recursive solution being elegant and
concise.

We can of course produce an iterative solution to the Towers of
Hanoi problem, but it is longer and less clear (to my mind) what is
going on.
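
The short code does hide a lot of work. As a small sketch (the class and method names are made up for this example), the method below mirrors the shape of the recursive solution but counts moves instead of printing them. Each extra disc doubles the count and adds one, which gives 2^n − 1 moves for n discs.

```java
class TowersCount {
    // Number of single-disc moves the recursive solution makes for n discs.
    static long moves(int n) {
        if (n == 1) return 1;      // base case: one disc, one move
        return moves(n - 1)        // substack to the intermediate tower
             + 1                   // largest disc to the destination
             + moves(n - 1);       // substack onto the largest disc
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++) {
            System.out.println(n + " discs: " + moves(n) + " moves");
        }
    }
}
```

For the four-disc puzzle used in the slides this gives 15 moves.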


Data structures assignment/week9c

Physical media for external storage Optimisations

CI583: Data Structures and Operating Systems
File systems: improvements over the early systems

Outline

1 Physical media for external storage

2 Optimisations

Media

So far, we have been ignoring the issue of the actual media that
data blocks are stored on.

We have been assuming that any block of data can be retrieved
with the same cost (O(1)), as if we were accessing locations in a
big array.

For a solid-state drive (SSD) this is more or less true.

However, if our data is stored on a (mechanical) hard disk drive
(HDD) this is very far from being the case. The fact that S5FS
also makes this simplifying assumption is one of the reasons for its
poor performance.

Image copyright http://www.file-recovery.com/

Disk architecture

A typical disk drive consists of several platters each of which has
one or two recording surfaces.

Each surface is divided into a number of concentric tracks, and
each track is divided into a number of sectors. All the sectors are
the same size, and outer tracks have more sectors.

Data is read and written using a set of read/write heads, one per
surface. The heads are connected to arms that move together
across the surfaces. Only one head is active at any one time. The
set of tracks that is under the heads at any one time is called a
cylinder.

It should be clear that the idea of the cylinder is an important one
– if we take steps to store data on separate surfaces but within the
same cylinder we can switch from one head to another (which we
can assume takes no time at all) without needing to move the
arms. To accommodate this, the positions of the heads are offset
from one another.

In order to read or write to a location on disk, the disk controller
does the following:

1 Move the heads over the correct cylinder. This is the seek
time.

2 Rotate the platter until the desired sector is under the head.
This is the rotational latency.

3 Rotate the platter so that the entire sector passes under the
head, in order to read or write data. This is the transfer time.

Rotational latency depends on the rate at which the disk spins
(e.g., 10,000 RPM), and transfer time depends on the spin rate
and the number of sectors per track.

Average seek time is usually the most important factor, in the low
milliseconds in a typical drive – a very long time compared to the
speed at which processors work.

An efficient file system has to take steps to reduce this.
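
A back-of-the-envelope sketch of how the three costs add up for a single access. The 10,000 RPM figure is the one quoted above; the 4 ms seek time and 500 sectors per track are assumed values chosen only to make the arithmetic concrete, and the class and method names are made up for this example.

```java
class DiskAccessTime {
    // Time for one full revolution, in milliseconds.
    static double revolutionMs(double rpm) {
        return 60_000.0 / rpm;
    }

    // On average the desired sector is half a revolution away.
    static double avgRotationalLatencyMs(double rpm) {
        return revolutionMs(rpm) / 2.0;
    }

    // One sector passes under the head in 1/sectorsPerTrack of a revolution.
    static double transferMs(double rpm, int sectorsPerTrack) {
        return revolutionMs(rpm) / sectorsPerTrack;
    }

    public static void main(String[] args) {
        double rpm = 10_000;        // spin rate quoted in the text
        double seekMs = 4.0;        // assumed average seek time
        int sectorsPerTrack = 500;  // assumed track layout

        double latency = avgRotationalLatencyMs(rpm);
        double transfer = transferMs(rpm, sectorsPerTrack);
        double total = seekMs + latency + transfer;
        System.out.printf("latency %.3f ms, transfer %.3f ms, total %.3f ms%n",
                latency, transfer, total);
    }
}
```

With these numbers a revolution takes 6 ms, so the average rotational latency is 3 ms and the transfer time is 0.012 ms. Almost all of the roughly 7 ms is spent positioning rather than moving data.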

Optimisations

With regard to storage media, the two key optimisations the
designer of a file system can make are to reduce seek time and
increase the amount of data transferred.

A straightforward way to reduce seek time is to use buffering.

Just as the OS buffers data from the file system (fetches more
than it needs and stores recently accessed data in a cache), disk
controllers use a pre-fetch buffer in which the whole of the most
recently accessed sector is stored.

This will improve latency for reads but not writes.
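
As an illustration of the "stores recently accessed data in a cache" idea on the OS side, here is a minimal sketch of a fixed-size block cache. It assumes a least-recently-used eviction policy, which the slides do not specify, and the class name BlockCache is made up for this example. It relies on the access-order mode of java.util.LinkedHashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Maps a block number to the cached contents of that block.
class BlockCache extends LinkedHashMap<Integer, byte[]> {
    private final int capacity;

    BlockCache(int capacity) {
        super(16, 0.75f, true);   // true = order entries by most recent access
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        BlockCache cache = new BlockCache(2);
        cache.put(7, new byte[512]);
        cache.put(8, new byte[512]);
        cache.get(7);                        // block 7 is now the most recently used
        cache.put(9, new byte[512]);         // cache is full, so block 8 is evicted
        System.out.println(cache.keySet());  // [7, 9]
    }
}
```

A read that finds its block in the cache avoids the disk entirely; a write still has to reach the disk at some point, which is why the benefit is mainly on the read side.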

Other improvements (used in file systems such as Linux ext2):

1 Increased block size. Helpful, but complex data allocation
strategies are needed to avoid excessive fragmentation (see
Doeppner, section 6.1).

2 Reduce seek time by data allocation strategies. Allocate the
next block in a way that takes disk architecture into account –
use as few cylinders as possible.

3 Reduce rotational latency by data allocation strategies.
Allocate blocks so that they can be read without repositioning
the heads.

This is not as simple as allocating the blocks sequentially – the
OS reads data one block at a time. So, to access two blocks,
the OS issues the first instruction then waits for an interrupt
to say that the work is complete.

By the time the OS responds by asking for the next block, the
disk has spun past the subsequent block and will need to
complete an entire revolution before the heads are over it
again. So, subsequent blocks are interleaved to give the OS
time to ask for them just as they are approaching the heads.
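
A small sketch of how an interleaved layout can be computed. It assumes a track of a fixed number of equally sized slots and an interleave factor meaning "place the next logical block this many slots further round"; the class and method names are made up for this example and are not taken from any real file system.

```java
import java.util.Arrays;

class SectorInterleave {
    // Returns an array in which layout[slot] is the logical block
    // stored in that physical slot of the track.
    static int[] layout(int sectors, int interleave) {
        int[] slots = new int[sectors];
        Arrays.fill(slots, -1);                 // -1 marks a free slot
        int pos = 0;
        for (int block = 0; block < sectors; block++) {
            while (slots[pos] != -1) {          // skip slots that are already used
                pos = (pos + 1) % sectors;
            }
            slots[pos] = block;
            pos = (pos + interleave) % sectors; // leave a gap before the next block
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(layout(8, 1))); // [0, 1, 2, 3, 4, 5, 6, 7]
        System.out.println(Arrays.toString(layout(8, 2))); // [0, 4, 1, 5, 2, 6, 3, 7]
    }
}
```

With a factor of 2, one unrelated slot passes under the head between logical blocks 0 and 1, which is the time the OS has to issue its next request.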

4 Clustering. Allocate blocks in groups, rather than one by one.
ext2 pre-allocates 8 blocks at a time, eventually giving them
up if they are not used or space becomes short.

5 Aggressive caching. If memory is abundant, maintain a very
large cache and pre-fetch entire files. Works well for reading.
For writing, we need to periodically write the cached data
back to disk and maintain a log of updates so that work is not
lost in case of a crash.


Data structures assignment/memory.zip

memory/commands
// Enter READ/WRITE commands into this file
// READ
// WRITE
READ bin 100
READ 19
WRITE hex CC32
READ bin 100000000000000
READ bin 100000000000000
WRITE bin 110000000000001
WRITE random

memory/Common.java

public class Common {

    static public long s2l(String s) {
        long i = 0;

        try {
            i = Long.parseLong(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        return i;
    }

    static public int s2i(String s) {
        int i = 0;

        try {
            i = Integer.parseInt(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        return i;
    }

    static public byte s2b(String s) {
        int i = 0;
        byte b = 0;

        try {
            i = Integer.parseInt(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        b = (byte) i;
        return b;
    }

    public static long randomLong(long MAX) {
        long i = -1;

        java.util.Random generator = new java.util.Random(System.currentTimeMillis());
        while (i > MAX || i
< 0 ) { int intOne = generator . nextInt (); int intTwo = generator . nextInt (); i = ( long ) intOne + intTwo ; } return i ; } } memory/ControlPanel.java memory/ControlPanel.java import java . applet . * ; import java . awt . * ; public class ControlPanel extends Frame { Kernel kernel ; Button runButton = new Button ( "run" ); Button stepButton = new Button ( "step" ); Button resetButton = new Button ( "reset" ); Button exitButton = new Button ( "exit" ); Button b0 = new Button ( "page " + ( 0 )); Button b1 = new Button ( "page " + ( 1 )); Button b2 = new Button ( "page " + ( 2 )); Button b3 = new Button ( "page " + ( 3 )); Button b4 = new Button ( "page " + ( 4 )); Button b5 = new Button ( "page " + ( 5 )); Button b6 = new Button ( "page " + ( 6 )); Button b7 = new Button ( "page " + ( 7 )); Button b8 = new Button ( "page " + ( 8 )); Button b9 = new Button ( "page " + ( 9 )); Button b10 = new Button ( "page " + ( 10 )); Button b11 = new Button ( "page " + ( 11 )); Button b12 = new Button ( "page " + ( 12 )); Button b13 = new Button ( "page " + ( 13 )); Button b14 = new Button ( "page " + ( 14 )); Button b15 = new Button ( "page " + ( 15 )); Button b16 = new Button ( "page " + ( 16 )); Button b17 = new Button ( "page " + ( 17 )); Button b18 = new Button ( "page " + ( 18 )); Button b19 = new Button ( "page " + ( 19 )); Button b20 = new Button ( "page " + ( 20 )); Button b21 = new Button ( "page " + ( 21 )); Button b22 = new Button ( "page " + ( 22 )); Button b23 = new Button ( "page " + ( 23 )); Button b24 = new Button ( "page " + ( 24 )); Button b25 = new Button ( "page " + ( 25 )); Button b26 = new Button ( "page " + ( 26 )); Button b27 = new Button ( "page " + ( 27 )); Button b28 = new Button ( "page " + ( 28 )); Button b29 = new Button ( "page " + ( 29 )); Button b30 = new Button ( "page " + ( 30 )); Button b31 = new Button ( "page " + ( 31 )); Button b32 = new Button ( "page " + ( 32 )); Button b33 = new Button ( "page " + ( 33 )); Button b34 = new Button ( 
"page " + ( 34 )); Button b35 = new Button ( "page " + ( 35 )); Button b36 = new Button ( "page " + ( 36 )); Button b37 = new Button ( "page " + ( 37 )); Button b38 = new Button ( "page " + ( 38 )); Button b39 = new Button ( "page " + ( 39 )); Button b40 = new Button ( "page " + ( 40 )); Button b41 = new Button ( "page " + ( 41 )); Button b42 = new Button ( "page " + ( 42 )); Button b43 = new Button ( "page " + ( 43 )); Button b44 = new Button ( "page " + ( 44 )); Button b45 = new Button ( "page " + ( 45 )); Button b46 = new Button ( "page " + ( 46 )); Button b47 = new Button ( "page " + ( 47 )); Button b48 = new Button ( "page " + ( 48 )); Button b49 = new Button ( "page " + ( 49 )); Button b50 = new Button ( "page " + ( 50 )); Button b51 = new Button ( "page " + ( 51 )); Button b52 = new Button ( "page " + ( 52 )); Button b53 = new Button ( "page " + ( 53 )); Button b54 = new Button ( "page " + ( 54 )); Button b55 = new Button ( "page " + ( 55 )); Button b56 = new Button ( "page " + ( 56 )); Button b57 = new Button ( "page " + ( 57 )); Button b58 = new Button ( "page " + ( 58 )); Button b59 = new Button ( "page " + ( 59 )); Button b60 = new Button ( "page " + ( 60 )); Button b61 = new Button ( "page " + ( 61 )); Button b62 = new Button ( "page " + ( 62 )); Button b63 = new Button ( "page " + ( 63 )); Label statusValueLabel = new Label ( "STOP" , Label . LEFT ) ; Label timeValueLabel = new Label ( "0" , Label . LEFT ) ; Label instructionValueLabel = new Label ( "NONE" , Label . LEFT ) ; Label addressValueLabel = new Label ( "NULL" , Label . LEFT ) ; Label pageFaultValueLabel = new Label ( "NO" , Label . LEFT ) ; Label virtualPageValueLabel = new Label ( "x" , Label . LEFT ) ; Label physicalPageValueLabel = new Label ( "0" , Label . LEFT ) ; Label RValueLabel = new Label ( "0" , Label . LEFT ) ; Label MValueLabel = new Label ( "0" , Label . LEFT ) ; Label inMemTimeValueLabel = new Label ( "0" , Label . 
LEFT ) ; Label lastTouchTimeValueLabel = new Label ( "0" , Label . LEFT ) ; Label lowValueLabel = new Label ( "0" , Label . LEFT ) ; Label highValueLabel = new Label ( "0" , Label . LEFT ) ; Label l0 = new Label ( null , Label . CENTER ); Label l1 = new Label ( null , Label . CENTER ); Label l2 = new Label ( null , Label . CENTER ); Label l3 = new Label ( null , Label . CENTER ); Label l4 = new Label ( null , Label . CENTER ); Label l5 = new Label ( null , Label . CENTER ); Label l6 = new Label ( null , Label . CENTER ); Label l7 = new Label ( null , Label . CENTER ); Label l8 = new Label ( null , Label . CENTER ); Label l9 = new Label ( null , Label . CENTER ); Label l10 = new Label ( null , Label . CENTER ); Label l11 = new Label ( null , Label . CENTER ); Label l12 = new Label ( null , Label . CENTER ); Label l13 = new Label ( null , Label . CENTER ); Label l14 = new Label ( null , Label . CENTER ); Label l15 = new Label ( null , Label . CENTER ); Label l16 = new Label ( null , Label . CENTER ); Label l17 = new Label ( null , Label . CENTER ); Label l18 = new Label ( null , Label . CENTER ); Label l19 = new Label ( null , Label . CENTER ); Label l20 = new Label ( null , Label . CENTER ); Label l21 = new Label ( null , Label . CENTER ); Label l22 = new Label ( null , Label . CENTER ); Label l23 = new Label ( null , Label . CENTER ); Label l24 = new Label ( null , Label . CENTER ); Label l25 = new Label ( null , Label . CENTER ); Label l26 = new Label ( null , Label . CENTER ); Label l27 = new Label ( null , Label . CENTER ); Label l28 = new Label ( null , Label . CENTER ); Label l29 = new Label ( null , Label . CENTER ); Label l30 = new Label ( null , Label . CENTER ); Label l31 = new Label ( null , Label . CENTER ); Label l32 = new Label ( null , Label . CENTER ); Label l33 = new Label ( null , Label . CENTER ); Label l34 = new Label ( null , Label . CENTER ); Label l35 = new Label ( null , Label . CENTER ); Label l36 = new Label ( null , Label . 
CENTER ); Label l37 = new Label ( null , Label . CENTER ); Label l38 = new Label ( null , Label . CENTER ); Label l39 = new Label ( null , Label . CENTER ); Label l40 = new Label ( null , Label . CENTER ); Label l41 = new Label ( null , Label . CENTER ); Label l42 = new Label ( null , Label . CENTER ); Label l43 = new Label ( null , Label . CENTER ); Label l44 = new Label ( null , Label . CENTER ); Label l45 = new Label ( null , Label . CENTER ); Label l46 = new Label ( null , Label . CENTER ); Label l47 = new Label ( null , Label . CENTER ); Label l48 = new Label ( null , Label . CENTER ); Label l49 = new Label ( null , Label . CENTER ); Label l50 = new Label ( null , Label . CENTER ); Label l51 = new Label ( null , Label . CENTER ); Label l52 = new Label ( null , Label . CENTER ); Label l53 = new Label ( null , Label . CENTER ); Label l54 = new Label ( null , Label . CENTER ); Label l55 = new Label ( null , Label . CENTER ); Label l56 = new Label ( null , Label . CENTER ); Label l57 = new Label ( null , Label . CENTER ); Label l58 = new Label ( null , Label . CENTER ); Label l59 = new Label ( null , Label . CENTER ); Label l60 = new Label ( null , Label . CENTER ); Label l61 = new Label ( null , Label . CENTER ); Label l62 = new Label ( null , Label . CENTER ); Label l63 = new Label ( null , Label . CENTER ); public ControlPanel () { super (); } public ControlPanel ( String title ) { super ( title ); } public void init ( Kernel useKernel , String commands , String config ) { kernel = useKernel ; kernel . setControlPanel ( this ); setLayout ( null ); setBackground ( Color . white ); setForeground ( Color . black ); resize ( 635 , 545 ); setFont ( new Font ( "Courier" , 0 , 12 ) ); runButton . setForeground ( Color . blue ); runButton . setBackground ( Color . lightGray ); runButton . reshape ( 0 , 25 , 70 , 15 ); add ( runButton ); stepButton . setForeground ( Color . blue ); stepButton . setBackground ( Color . lightGray ); stepButton . 
reshape ( 70 , 25 , 70 , 15 ); add ( stepButton ); resetButton . setForeground ( Color . blue ); resetButton . setBackground ( Color . lightGray ); resetButton . reshape ( 140 , 25 , 70 , 15 ); add ( resetButton ); exitButton . setForeground ( Color . blue ); exitButton . setBackground ( Color . lightGray ); exitButton . reshape ( 210 , 25 , 70 , 15 ); add ( exitButton ); b0 . reshape ( 0 , ( 0 + 2 ) * 15 + 25 , 70 , 15 ); b0 . setForeground ( Color . magenta ); b0 . setBackground ( Color . lightGray ); add ( b0 ); b1 . reshape ( 0 , ( 1 + 2 ) * 15 + 25 , 70 , 15 ); b1 . setForeground ( Color . magenta ); b1 . setBackground ( Color . lightGray ); add ( b1 ); b2 . reshape ( 0 , ( 2 + 2 ) * 15 + 25 , 70 , 15 ); b2 . setForeground ( Color . magenta ); b2 . setBackground ( Color . lightGray ); add ( b2 ); b3 . reshape ( 0 , ( 3 + 2 ) * 15 + 25 , 70 , 15 ); b3 . setForeground ( Color . magenta ); b3 . setBackground ( Color . lightGray ); add ( b3 ); b4 . reshape ( 0 , ( 4 + 2 ) * 15 + 25 , 70 , 15 ); b4 . setForeground ( Color . magenta ); b4 . setBackground ( Color . lightGray ); add ( b4 ); b5 . reshape ( 0 , ( 5 + 2 ) * 15 + 25 , 70 , 15 ); b5 . setForeground ( Color . magenta ); b5 . setBackground ( Color . lightGray ); add ( b5 ); b6 . reshape ( 0 , ( 6 + 2 ) * 15 + 25 , 70 , 15 ); b6 . setForeground ( Color . magenta ); b6 . setBackground ( Color . lightGray ); add ( b6 ); b7 . reshape ( 0 , ( 7 + 2 ) * 15 + 25 , 70 , 15 ); b7 . setForeground ( Color . magenta ); b7 . setBackground ( Color . lightGray ); add ( b7 ); b8 . reshape ( 0 , ( 8 + 2 ) * 15 + 25 , 70 , 15 ); b8 . setForeground ( Color . magenta ); b8 . setBackground ( Color . lightGray ); add ( b8 ); b9 . reshape ( 0 , ( 9 + 2 ) * 15 + 25 , 70 , 15 ); b9 . setForeground ( Color . magenta ); b9 . setBackground ( Color . lightGray ); add ( b9 ); b10 . reshape ( 0 , ( 10 + 2 ) * 15 + 25 , 70 , 15 ); b10 . setForeground ( Color . magenta ); b10 . setBackground ( Color . lightGray ); add ( b10 ); b11 . 
reshape ( 0 , ( 11 + 2 ) * 15 + 25 , 70 , 15 ); b11 . setForeground ( Color . magenta ); b11 . setBackground ( Color . lightGray ); add ( b11 ); b12 . reshape ( 0 , ( 12 + 2 ) * 15 + 25 , 70 , 15 ); b12 . setForeground ( Color . magenta ); b12 . setBackground ( Color . lightGray ); add ( b12 ); b13 . reshape ( 0 , ( 13 + 2 ) * 15 + 25 , 70 , 15 ); b13 . setForeground ( Color . magenta ); b13 . setBackground ( Color . lightGray ); add ( b13 ); b14 . reshape ( 0 , ( 14 + 2 ) * 15 + 25 , 70 , 15 ); b14 . setForeground ( Color . magenta ); b14 . setBackground ( Color . lightGray ); add ( b14 ); b15 . reshape ( 0 , ( 15 + 2 ) * 15 + 25 , 70 , 15 ); b15 . setForeground ( Color . magenta ); b15 . setBackground ( Color . lightGray ); add ( b15 ); b16 . reshape ( 0 , ( 16 + 2 ) * 15 + 25 , 70 , 15 ); b16 . setForeground ( Color . magenta ); b16 . setBackground ( Color . lightGray ); add ( b16 ); b17 . reshape ( 0 , ( 17 + 2 ) * 15 + 25 , 70 , 15 ); b17 . setForeground ( Color . magenta ); b17 . setBackground ( Color . lightGray ); add ( b17 ); b18 . reshape ( 0 , ( 18 + 2 ) * 15 + 25 , 70 , 15 ); b18 . setForeground ( Color . magenta ); b18 . setBackground ( Color . lightGray ); add ( b18 ); b19 . reshape ( 0 , ( 19 + 2 ) * 15 + 25 , 70 , 15 ); b19 . setForeground ( Color . magenta ); b19 . setBackground ( Color . lightGray ); add ( b19 ); b20 . reshape ( 0 , ( 20 + 2 ) * 15 + 25 , 70 , 15 ); b20 . setForeground ( Color . magenta ); b20 . setBackground ( Color . lightGray ); add ( b20 ); b21 . reshape ( 0 , ( 21 + 2 ) * 15 + 25 , 70 , 15 ); b21 . setForeground ( Color . magenta ); b21 . setBackground ( Color . lightGray ); add ( b21 ); b22 . reshape ( 0 , ( 22 + 2 ) * 15 + 25 , 70 , 15 ); b22 . setForeground ( Color . magenta ); b22 . setBackground ( Color . lightGray ); add ( b22 ); b23 . reshape ( 0 , ( 23 + 2 ) * 15 + 25 , 70 , 15 ); b23 . setForeground ( Color . magenta ); b23 . setBackground ( Color . lightGray ); add ( b23 ); b24 . 
reshape ( 0 , ( 24 + 2 ) * 15 + 25 , 70 , 15 ); b24 . setForeground ( Color . magenta ); b24 . setBackground ( Color . lightGray ); add ( b24 ); b25 . reshape ( 0 , ( 25 + 2 ) * 15 + 25 , 70 , 15 ); b25 . setForeground ( Color . magenta ); b25 . setBackground ( Color . lightGray ); add ( b25 ); b26 . reshape ( 0 , ( 26 + 2 ) * 15 + 25 , 70 , 15 ); b26 . setForeground ( Color . magenta ); b26 . setBackground ( Color . lightGray ); add ( b26 ); b27 . reshape ( 0 , ( 27 + 2 ) * 15 + 25 , 70 , 15 ); b27 . setForeground ( Color . magenta ); b27 . setBackground ( Color . lightGray ); add ( b27 ); b28 . reshape ( 0 , ( 28 + 2 ) * 15 + 25 , 70 , 15 ); b28 . setForeground ( Color . magenta ); b28 . setBackground ( Color . lightGray ); add ( b28 ); b29 . reshape ( 0 , ( 29 + 2 ) * 15 + 25 , 70 , 15 ); b29 . setForeground ( Color . magenta ); b29 . setBackground ( Color . lightGray ); add ( b29 ); b30 . reshape ( 0 , ( 30 + 2 ) * 15 + 25 , 70 , 15 ); b30 . setForeground ( Color . magenta ); b30 . setBackground ( Color . lightGray ); add ( b30 ); b31 . reshape ( 0 , ( 31 + 2 ) * 15 + 25 , 70 , 15 ); b31 . setForeground ( Color . magenta ); b31 . setBackground ( Color . lightGray ); add ( b31 ); b32 . reshape ( 140 , ( 0 + 2 ) * 15 + 25 , 70 , 15 ); b32 . setForeground ( Color . magenta ); b32 . setBackground ( Color . lightGray ); add ( b32 ); b33 . reshape ( 140 , ( 1 + 2 ) * 15 + 25 , 70 , 15 ); b33 . setForeground ( Color . magenta ); b33 . setBackground ( Color . lightGray ); add ( b33 ); b34 . reshape ( 140 , ( 2 + 2 ) * 15 + 25 , 70 , 15 ); b34 . setForeground ( Color . magenta ); b34 . setBackground ( Color . lightGray ); add ( b34 ); b35 . reshape ( 140 , ( 3 + 2 ) * 15 + 25 , 70 , 15 ); b35 . setForeground ( Color . magenta ); b35 . setBackground ( Color . lightGray ); add ( b35 ); b36 . reshape ( 140 , ( 4 + 2 ) * 15 + 25 , 70 , 15 ); b36 . setForeground ( Color . magenta ); b36 . setBackground ( Color . lightGray ); add ( b36 ); b37 . 
reshape ( 140 , ( 5 + 2 ) * 15 + 25 , 70 , 15 ); b37 . setForeground ( Color . magenta ); b37 . setBackground ( Color . lightGray ); add ( b37 ); b38 . reshape ( 140 , ( 6 + 2 ) * 15 + 25 , 70 , 15 ); b38 . setForeground ( Color . magenta ); b38 . setBackground ( Color . lightGray ); add ( b38 ); b39 . reshape ( 140 , ( 7 + 2 ) * 15 + 25 , 70 , 15 ); b39 . setForeground ( Color . magenta ); b39 . setBackground ( Color . lightGray ); add ( b39 ); b40 . reshape ( 140 , ( 8 + 2 ) * 15 + 25 , 70 , 15 ); b40 . setForeground ( Color . magenta ); b40 . setBackground ( Color . lightGray ); add ( b40 ); b41 . reshape ( 140 , ( 9 + 2 ) * 15 + 25 , 70 , 15 ); b41 . setForeground ( Color . magenta ); b41 . setBackground ( Color . lightGray ); add ( b41 ); b42 . reshape ( 140 , ( 10 + 2 ) * 15 + 25 , 70 , 15 ); b42 . setForeground ( Color . magenta ); b42 . setBackground ( Color . lightGray ); add ( b42 ); b43 . reshape ( 140 , ( 11 + 2 ) * 15 + 25 , 70 , 15 ); b43 . setForeground ( Color . magenta ); b43 . setBackground ( Color . lightGray ); add ( b43 ); b44 . reshape ( 140 , ( 12 + 2 ) * 15 + 25 , 70 , 15 ); b44 . setForeground ( Color . magenta ); b44 . setBackground ( Color . lightGray ); add ( b44 ); b45 . reshape ( 140 , ( 13 + 2 ) * 15 + 25 , 70 , 15 ); b45 . setForeground ( Color . magenta ); b45 . setBackground ( Color . lightGray ); add ( b45 ); b46 . reshape ( 140 , ( 14 + 2 ) * 15 + 25 , 70 , 15 ); b46 . setForeground ( Color . magenta ); b46 . setBackground ( Color . lightGray ); add ( b46 ); b47 . reshape ( 140 , ( 15 + 2 ) * 15 + 25 , 70 , 15 ); b47 . setForeground ( Color . magenta ); b47 . setBackground ( Color . lightGray ); add ( b47 ); b48 . reshape ( 140 , ( 16 + 2 ) * 15 + 25 , 70 , 15 ); b48 . setForeground ( Color . magenta ); b48 . setBackground ( Color . lightGray ); add ( b48 ); b49 . reshape ( 140 , ( 17 + 2 ) * 15 + 25 , 70 , 15 ); b49 . setForeground ( Color . magenta ); b49 . setBackground ( Color . lightGray ); add ( b49 ); b50 . 
reshape ( 140 , ( 18 + 2 ) * 15 + 25 , 70 , 15 ); b50 . setForeground ( Color . magenta ); b50 . setBackground ( Color . lightGray ); add ( b50 ); b51 . reshape ( 140 , ( 19 + 2 ) * 15 + 25 , 70 , 15 ); b51 . setForeground ( Color . magenta ); b51 . setBackground ( Color . lightGray ); add ( b51 ); b52 . reshape ( 140 , ( 20 + 2 ) * 15 + 25 , 70 , 15 ); b52 . setForeground ( Color . magenta ); b52 . setBackground ( Color . lightGray ); add ( b52 ); b53 . reshape ( 140 , ( 21 + 2 ) * 15 + 25 , 70 , 15 ); b53 . setForeground ( Color . magenta ); b53 . setBackground ( Color . lightGray ); add ( b53 ); b54 . reshape ( 140 , ( 22 + 2 ) * 15 + 25 , 70 , 15 ); b54 . setForeground ( Color . magenta ); b54 . setBackground ( Color . lightGray ); add ( b54 ); b55 . reshape ( 140 , ( 23 + 2 ) * 15 + 25 , 70 , 15 ); b55 . setForeground ( Color . magenta ); b55 . setBackground ( Color . lightGray ); add ( b55 ); b56 . reshape ( 140 , ( 24 + 2 ) * 15 + 25 , 70 , 15 ); b56 . setForeground ( Color . magenta ); b56 . setBackground ( Color . lightGray ); add ( b56 ); b57 . reshape ( 140 , ( 25 + 2 ) * 15 + 25 , 70 , 15 ); b57 . setForeground ( Color . magenta ); b57 . setBackground ( Color . lightGray ); add ( b57 ); b58 . reshape ( 140 , ( 26 + 2 ) * 15 + 25 , 70 , 15 ); b58 . setForeground ( Color . magenta ); b58 . setBackground ( Color . lightGray ); add ( b58 ); b59 . reshape ( 140 , ( 27 + 2 ) * 15 + 25 , 70 , 15 ); b59 . setForeground ( Color . magenta ); b59 . setBackground ( Color . lightGray ); add ( b59 ); b60 . reshape ( 140 , ( 28 + 2 ) * 15 + 25 , 70 , 15 ); b60 . setForeground ( Color . magenta ); b60 . setBackground ( Color . lightGray ); add ( b60 ); b61 . reshape ( 140 , ( 29 + 2 ) * 15 + 25 , 70 , 15 ); b61 . setForeground ( Color . magenta ); b61 . setBackground ( Color . lightGray ); add ( b61 ); b62 . reshape ( 140 , ( 30 + 2 ) * 15 + 25 , 70 , 15 ); b62 . setForeground ( Color . magenta ); b62 . setBackground ( Color . lightGray ); add ( b62 ); b63 . 
reshape ( 140 , ( 31 + 2 ) * 15 + 25 , 70 , 15 ); b63 . setForeground ( Color . magenta ); b63 . setBackground ( Color . lightGray ); add ( b63 ); statusValueLabel . reshape ( 345 , 0 + 25 , 100 , 15 ); add ( statusValueLabel ); timeValueLabel . reshape ( 345 , 15 + 25 , 100 , 15 ); add( timeValueLabel ); instructionValueLabel.reshape( 385,45+25,100,15 ); add( instructionValueLabel ); addressValueLabel.reshape(385,60+25,230,15); add( addressValueLabel ); pageFaultValueLabel.reshape( 385,90+25,100,15 ); add( pageFaultValueLabel ); virtualPageValueLabel.reshape( 395,120+25,200,15 ); add( virtualPageValueLabel ); physicalPageValueLabel.reshape( 395,135+25,200,15 ); add( physicalPageValueLabel ); RValueLabel.reshape( 395,150+25,200,15 ); add( RValueLabel ); MValueLabel.reshape( 395,165+25,200,15 ); add( MValueLabel ); inMemTimeValueLabel.reshape(395,180+25,200,15 ); add( inMemTimeValueLabel ); lastTouchTimeValueLabel.reshape( 395,195+25,200,15 ); add( lastTouchTimeValueLabel ); lowValueLabel.reshape( 395,210+25,230,15 ); add( lowValueLabel ); highValueLabel.reshape( 395,225+25,230,15 ); add( highValueLabel ); Label virtualOneLabel = new Label( "virtual" , Label.CENTER) ; virtualOneLabel.reshape(0,15+25,70,15); add(virtualOneLabel); Label virtualTwoLabel = new Label( "virtual" , Label.CENTER) ; virtualTwoLabel.reshape(140,15+25,70,15); add(virtualTwoLabel); Label physicalOneLabel = new Label( "physical" , Label.CENTER) ; physicalOneLabel.reshape(70,15+25,70,15); add(physicalOneLabel); Label physicalTwoLabel = new Label( "physical" , Label.CENTER) ; physicalTwoLabel.reshape(210,15+25,70,15); add(physicalTwoLabel); Label statusLabel = new Label("status: " , Label.LEFT) ; statusLabel.reshape(285,0+25,65,15); add(statusLabel); Label timeLabel = new Label("time: " , Label.LEFT) ; timeLabel.reshape(285,15+25,50,15); add(timeLabel); Label instructionLabel = new Label("instruction: " , Label.LEFT) ; instructionLabel.reshape(285,45+25,100,15); add(instructionLabel); Label 
addressLabel = new Label("address: " , Label.LEFT) ; addressLabel.reshape(285,60+25,85,15); add(addressLabel); Label pageFaultLabel = new Label("page fault: " , Label.LEFT) ; pageFaultLabel.reshape(285,90+25,100,15); add(pageFaultLabel); Label virtualPageLabel = new Label("virtual page: " , Label.LEFT) ; virtualPageLabel.reshape(285,120+25,110,15); add(virtualPageLabel); Label physicalPageLabel = new Label("physical page: " , Label.LEFT) ; physicalPageLabel.reshape(285,135+25,110,15); add(physicalPageLabel); Label RLabel = new Label("R: ", Label.LEFT) ; RLabel.reshape(285,150+25,110,15); add(RLabel); Label MLabel = new Label("M: " , Label.LEFT) ; MLabel.reshape(285,165+25,110,15); add(MLabel); Label inMemTimeLabel = new Label("inMemTime: " , Label.LEFT) ; inMemTimeLabel.reshape(285,180+25,110,15); add(inMemTimeLabel); Label lastTouchTimeLabel = new Label("lastTouchTime: " , Label.LEFT) ; lastTouchTimeLabel.reshape(285,195+25,110,15); add(lastTouchTimeLabel); Label lowLabel = new Label("low: " , Label.LEFT) ; lowLabel.reshape(285,210+25,110,15); add(lowLabel); Label highLabel = new Label("high: " , Label.LEFT) ; highLabel.reshape(285,225+25,110,15); add(highLabel); l0.reshape( 70, (2)*15+25, 60, 15 ); l0.setForeground( Color.red ); l0.setFont( new Font( "Courier", 0, 10 ) ); add( l0 ); l1.reshape( 70, (3)*15+25, 60, 15 ); l1.setForeground( Color.red ); l1.setFont( new Font( "Courier", 0, 10 ) ); add( l1 ); l2.reshape( 70, (4)*15+25, 60, 15 ); l2.setForeground( Color.red ); l2.setFont( new Font( "Courier", 0, 10 ) ); add( l2 ); l3.reshape( 70, (5)*15+25, 60, 15 ); l3.setForeground( Color.red ); l3.setFont( new Font( "Courier", 0, 10 ) ); add( l3 ); l4.reshape( 70, (6)*15+25, 60, 15 ); l4.setForeground( Color.red ); l4.setFont( new Font( "Courier", 0, 10 ) ); add( l4 ); l5.reshape( 70, (7)*15+25, 60, 15 ); l5.setForeground( Color.red ); l5.setFont( new Font( "Courier", 0, 10 ) ); add( l5 ); l6.reshape( 70, (8)*15+25, 60, 15 ); l6.setForeground( Color.red ); 
l6.setFont( new Font( "Courier", 0, 10 ) ); add( l6 ); l7.reshape( 70, (9)*15+25, 60, 15 ); l7.setForeground( Color.red ); l7.setFont( new Font( "Courier", 0, 10 ) ); add( l7 ); l8.reshape( 70, (10)*15+25, 60, 15 ); l8.setForeground( Color.red ); l8.setFont( new Font( "Courier", 0, 10 ) ); add( l8 ); l9.reshape( 70, (11)*15+25, 60, 15 ); l9.setForeground( Color.red ); l9.setFont( new Font( "Courier", 0, 10 ) ); add( l9 ); l10.reshape( 70, (12)*15+25, 60, 15 ); l10.setForeground( Color.red ); l10.setFont( new Font( "Courier", 0, 10 ) ); add( l10 ); l11.reshape( 70, (13)*15+25, 60, 15 ); l11.setForeground( Color.red ); l11.setFont( new Font( "Courier", 0, 10 ) ); add( l11 ); l12.reshape( 70, (14)*15+25, 60, 15 ); l12.setForeground( Color.red ); l12.setFont( new Font( "Courier", 0, 10 ) ); add( l12 ); l13.reshape( 70, (15)*15+25, 60, 15 ); l13.setForeground( Color.red ); l13.setFont( new Font( "Courier", 0, 10 ) ); add( l13 ); l14.reshape( 70, (16)*15+25, 60, 15 ); l14.setForeground( Color.red ); l14.setFont( new Font( "Courier", 0, 10 ) ); add( l14 ); l15.reshape( 70, (17)*15+25, 60, 15 ); l15.setForeground( Color.red ); l15.setFont( new Font( "Courier", 0, 10 ) ); add( l15 ); l16.reshape( 70, (18)*15+25, 60, 15 ); l16.setForeground( Color.red ); l16.setFont( new Font( "Courier", 0, 10 ) ); add( l16 ); l17.reshape( 70, (19)*15+25, 60, 15 ); l17.setForeground( Color.red ); l17.setFont( new Font( "Courier", 0, 10 ) ); add( l17 ); l18.reshape( 70, (20)*15+25, 60, 15 ); l18.setForeground( Color.red ); l18.setFont( new Font( "Courier", 0, 10 ) ); add( l18 ); l19.reshape( 70, (21)*15+25, 60, 15 ); l19.setForeground( Color.red ); l19.setFont( new Font( "Courier", 0, 10 ) ); add( l19 ); l20.reshape( 70, (22)*15+25, 60, 15 ); l20.setForeground( Color.red ); l20.setFont( new Font( "Courier", 0, 10 ) ); add( l20 ); l21.reshape( 70, (23)*15+25, 60, 15 ); l21.setForeground( Color.red ); l21.setFont( new Font( "Courier", 0, 10 ) ); add( l21 ); l22.reshape( 70, (24)*15+25, 60, 15 
); l22.setForeground( Color.red ); l22.setFont( new Font( "Courier", 0, 10 ) ); add( l22 ); l23.reshape( 70, (25)*15+25, 60, 15 ); l23.setForeground( Color.red ); l23.setFont( new Font( "Courier", 0, 10 ) ); add( l23 ); l24.reshape( 70, (26)*15+25, 60, 15 ); l24.setForeground( Color.red ); l24.setFont( new Font( "Courier", 0, 10 ) ); add( l24 ); l25.reshape( 70, (27)*15+25, 60, 15 ); l25.setForeground( Color.red ); l25.setFont( new Font( "Courier", 0, 10 ) ); add( l25 ); l26.reshape( 70, (28)*15+25, 60, 15 ); l26.setForeground( Color.red ); l26.setFont( new Font( "Courier", 0, 10 ) ); add( l26 ); l27.reshape( 70, (29)*15+25, 60, 15 ); l27.setForeground( Color.red ); l27.setFont( new Font( "Courier", 0, 10 ) ); add( l27 ); l28.reshape( 70, (30)*15+25, 60, 15 ); l28.setForeground( Color.red ); l28.setFont( new Font( "Courier", 0, 10 ) ); add( l28 ); l29.reshape( 70, (31)*15+25, 60, 15 ); l29.setForeground( Color.red ); l29.setFont( new Font( "Courier", 0, 10 ) ); add( l29 ); l30.reshape( 70, (32)*15+25, 60, 15 ); l30.setForeground( Color.red ); l30.setFont( new Font( "Courier", 0, 10 ) ); add( l30 ); l31.reshape( 70, (33)*15+25, 60, 15 ); l31.setForeground( Color.red ); l31.setFont( new Font( "Courier", 0, 10 ) ); add( l31 ); l32.reshape( 210, (2)*15+25, 60, 15 ); l32.setForeground( Color.red ); l32.setFont( new Font( "Courier", 0, 10 ) ); add( l32 ); l33.reshape( 210, (3)*15+25, 60, 15 ); l33.setForeground( Color.red ); l33.setFont( new Font( "Courier", 0, 10 ) ); add( l33 ); l34.reshape( 210, (4)*15+25, 60, 15 ); l34.setForeground( Color.red ); l34.setFont( new Font( "Courier", 0, 10 ) ); add( l34 ); l35.reshape( 210, (5)*15+25, 60, 15 ); l35.setForeground( Color.red ); l35.setFont( new Font( "Courier", 0, 10 ) ); add( l35 ); l36.reshape( 210, (6)*15+25, 60, 15 ); l36.setForeground( Color.red ); l36.setFont( new Font( "Courier", 0, 10 ) ); add( l36 ); l37.reshape( 210, (7)*15+25, 60, 15 ); l37.setForeground( Color.red ); l37.setFont( new Font( "Courier", 0, 10 ) ); 
add( l37 ); l38.reshape( 210, (8)*15+25, 60, 15 ); l38.setForeground( Color.red ); l38.setFont( new Font( "Courier", 0, 10 ) ); add( l38 ); l39.reshape( 210, (9)*15+25, 60, 15 ); l39.setForeground( Color.red ); l39.setFont( new Font( "Courier", 0, 10 ) ); add( l39 ); l40.reshape( 210, (10)*15+25, 60, 15 ); l40.setForeground( Color.red ); l40.setFont( new Font( "Courier", 0, 10 ) ); add( l40 ); l41.reshape( 210, (11)*15+25, 60, 15 ); l41.setForeground( Color.red ); l41.setFont( new Font( "Courier", 0, 10 ) ); add( l41 ); l42.reshape( 210, (12)*15+25, 60, 15 ); l42.setForeground( Color.red ); l42.setFont( new Font( "Courier", 0, 10 ) ); add( l42 ); l43.reshape( 210, (13)*15+25, 60, 15 ); l43.setForeground( Color.red ); l43.setFont( new Font( "Courier", 0, 10 ) ); add( l43 ); l44.reshape( 210, (14)*15+25, 60, 15 ); l44.setForeground( Color.red ); l44.setFont( new Font( "Courier", 0, 10 ) ); add( l44 ); l45.reshape( 210, (15)*15+25, 60, 15 ); l45.setForeground( Color.red ); l45.setFont( new Font( "Courier", 0, 10 ) ); add( l45 ); l46.reshape( 210, (16)*15+25, 60, 15 ); l46.setForeground( Color.red ); l46.setFont( new Font( "Courier", 0, 10 ) ); add( l46 ); l47.reshape( 210, (17)*15+25, 60, 15 ); l47.setForeground( Color.red ); l47.setFont( new Font( "Courier", 0, 10 ) ); add( l47 ); l48.reshape( 210, (18)*15+25, 60, 15 ); l48.setForeground( Color.red ); l48.setFont( new Font( "Courier", 0, 10 ) ); add( l48 ); l49.reshape( 210, (19)*15+25, 60, 15 ); l49.setForeground( Color.red ); l49.setFont( new Font( "Courier", 0, 10 ) ); add( l49 ); l50.reshape( 210, (20)*15+25, 60, 15 ); l50.setForeground( Color.red ); l50.setFont( new Font( "Courier", 0, 10 ) ); add( l50 ); l51.reshape( 210, (21)*15+25, 60, 15 ); l51.setForeground( Color.red ); l51.setFont( new Font( "Courier", 0, 10 ) ); add( l51 ); l52.reshape( 210, (22)*15+25, 60, 15 ); l52.setForeground( Color.red ); l52.setFont( new Font( "Courier", 0, 10 ) ); add( l52 ); l53.reshape( 210, (23)*15+25, 60, 15 ); 
l53.setForeground( Color.red ); l53.setFont( new Font( "Courier", 0, 10 ) ); add( l53 ); l54.reshape( 210, (24)*15+25, 60, 15 ); l54.setForeground( Color.red ); l54.setFont( new Font( "Courier", 0, 10 ) ); add( l54 ); l55.reshape( 210, (25)*15+25, 60, 15 ); l55.setForeground( Color.red ); l55.setFont( new Font( "Courier", 0, 10 ) ); add( l55 ); l56.reshape( 210, (26)*15+25, 60, 15 ); l56.setForeground( Color.red ); l56.setFont( new Font( "Courier", 0, 10 ) ); add( l56 ); l57.reshape( 210, (27)*15+25, 60, 15 ); l57.setForeground( Color.red ); l57.setFont( new Font( "Courier", 0, 10 ) ); add( l57 ); l58.reshape( 210, (28)*15+25, 60, 15 ); l58.setForeground( Color.red ); l58.setFont( new Font( "Courier", 0, 10 ) ); add( l58 ); l59.reshape( 210, (29)*15+25, 60, 15 ); l59.setForeground( Color.red ); l59.setFont( new Font( "Courier", 0, 10 ) ); add( l59 ); l60.reshape( 210, (30)*15+25, 60, 15 ); l60.setForeground( Color.red ); l60.setFont( new Font( "Courier", 0, 10 ) ); add( l60 ); l61.reshape( 210, (31)*15+25, 60, 15 ); l61.setForeground( Color.red ); l61.setFont( new Font( "Courier", 0, 10 ) ); add( l61 ); l62.reshape( 210, (32)*15+25, 60, 15 ); l62.setForeground( Color.red ); l62.setFont( new Font( "Courier", 0, 10 ) ); add( l62 ); l63.reshape( 210, (33)*15+25, 60, 15 ); l63.setForeground( Color.red ); l63.setFont( new Font( "Courier", 0, 10 ) ); add( l63 ); kernel.init( commands , config ); show(); } public void paintPage( Page page ) { virtualPageValueLabel.setText( Integer.toString( page.id ) ); physicalPageValueLabel.setText( Integer.toString( page.physical ) ); RValueLabel.setText( Integer.toString( page.R ) ); MValueLabel.setText( Integer.toString( page.M ) ); inMemTimeValueLabel.setText( Integer.toString( page.inMemTime ) ); lastTouchTimeValueLabel.setText( Integer.toString( page.lastTouchTime ) ); lowValueLabel.setText(Long.toString( page.low , Kernel.addressradix ) ); highValueLabel.setText(Long.toString( page.high , Kernel.addressradix ) ); } public void 
setStatus(String status) { statusValueLabel.setText(status); } public void addPhysicalPage( int pageNum , int physicalPage ) { if ( physicalPage == 0 ) { l0.setText( "page " + pageNum ); } else if ( physicalPage == 1) { l1.setText( "page " + pageNum ); } else if ( physicalPage == 2) { l2.setText( "page " + pageNum ); } else if ( physicalPage == 3) { l3.setText( "page " + pageNum ); } else if ( physicalPage == 4) { l4.setText( "page " + pageNum ); } else if ( physicalPage == 5) { l5.setText( "page " + pageNum ); } else if ( physicalPage == 6) { l6.setText( "page " + pageNum ); } else if ( physicalPage == 7) { l7.setText( "page " + pageNum ); } else if ( physicalPage == 8) { l8.setText( "page " + pageNum ); } else if ( physicalPage == 9) { l9.setText( "page " + pageNum ); } else if ( physicalPage == 10) { l10.setText( "page " + pageNum ); } else if ( physicalPage == 11) { l11.setText( "page " + pageNum ); } else if ( physicalPage == 12) { l12.setText( "page " + pageNum ); } else if ( physicalPage == 13) { l13.setText( "page " + pageNum ); } else if ( physicalPage == 14) { l14.setText( "page " + pageNum ); } else if ( physicalPage == 15) { l15.setText( "page " + pageNum ); } else if ( physicalPage == 16) { l16.setText( "page " + pageNum ); } else if ( physicalPage == 17) { l17.setText( "page " + pageNum ); } else if ( physicalPage == 18) { l18.setText( "page " + pageNum ); } else if ( physicalPage == 19) { l19.setText( "page " + pageNum ); } else if ( physicalPage == 20) { l20.setText( "page " + pageNum ); } else if ( physicalPage == 21) { l21.setText( "page " + pageNum ); } else if ( physicalPage == 22) { l22.setText( "page " + pageNum ); } else if ( physicalPage == 23) { l23.setText( "page " + pageNum ); } else if ( physicalPage == 24) { l24.setText( "page " + pageNum ); } else if ( physicalPage == 25) { l25.setText( "page " + pageNum ); } else if ( physicalPage == 26) { l26.setText( "page " + pageNum ); } else if ( physicalPage == 27) { l27.setText( "page " + 
pageNum ); } else if ( physicalPage == 28) { l28.setText( "page " + pageNum ); } else if ( physicalPage == 29) { l29.setText( "page " + pageNum ); } else if ( physicalPage == 30) { l30.setText( "page " + pageNum ); } else if ( physicalPage == 31) { l31.setText( "page " + pageNum ); } else if ( physicalPage == 32) { l32.setText( "page " + pageNum ); } else if ( physicalPage == 33) { l33.setText( "page " + pageNum ); } else if ( physicalPage == 34) { l34.setText( "page " + pageNum ); } else if ( physicalPage == 35) { l35.setText( "page " + pageNum ); } else if ( physicalPage == 36) { l36.setText( "page " + pageNum ); } else if ( physicalPage == 37) { l37.setText( "page " + pageNum ); } else if ( physicalPage == 38) { l38.setText( "page " + pageNum ); } else if ( physicalPage == 39) { l39.setText( "page " + pageNum ); } else if ( physicalPage == 40) { l40.setText( "page " + pageNum ); } else if ( physicalPage == 41) { l41.setText( "page " + pageNum ); } else if ( physicalPage == 42) { l42.setText( "page " + pageNum ); } else if ( physicalPage == 43) { l43.setText( "page " + pageNum ); } else if ( physicalPage == 44) { l44.setText( "page " + pageNum ); } else if ( physicalPage == 45) { l45.setText( "page " + pageNum ); } else if ( physicalPage == 46) { l46.setText( "page " + pageNum ); } else if ( physicalPage == 47) { l47.setText( "page " + pageNum ); } else if ( physicalPage == 48) { l48.setText( "page " + pageNum ); } else if ( physicalPage == 49) { l49.setText( "page " + pageNum ); } else if ( physicalPage == 50) { l50.setText( "page " + pageNum ); } else if ( physicalPage == 51) { l51.setText( "page " + pageNum ); } else if ( physicalPage == 52) { l52.setText( "page " + pageNum ); } else if ( physicalPage == 53) { l53.setText( "page " + pageNum ); } else if ( physicalPage == 54) { l54.setText( "page " + pageNum ); } else if ( physicalPage == 55) { l55.setText( "page " + pageNum ); } else if ( physicalPage == 56) { l56.setText( "page " + pageNum ); } else if ( 
physicalPage == 57) { l57.setText( "page " + pageNum ); } else if ( physicalPage == 58) { l58.setText( "page " + pageNum ); } else if ( physicalPage == 59) { l59.setText( "page " + pageNum ); } else if ( physicalPage == 60) { l60.setText( "page " + pageNum ); } else if ( physicalPage == 61) { l61.setText( "page " + pageNum ); } else if ( physicalPage == 62) { l62.setText( "page " + pageNum ); } else if ( physicalPage == 63) { l63.setText( "page " + pageNum ); } else { return; } } public void removePhysicalPage( int physicalPage ) { if ( physicalPage == 0 ) { l0.setText( null ); } else if ( physicalPage == 1) { l1.setText( null ); } else if ( physicalPage == 2) { l2.setText(null ); } else if ( physicalPage == 3) { l3.setText( null ); } else if ( physicalPage == 4) { l4.setText( null ); } else if ( physicalPage == 5) { l5.setText( null ); } else if ( physicalPage == 6) { l6.setText( null ); } else if ( physicalPage == 7) { l7.setText( null ); } else if ( physicalPage == 8) { l8.setText( null ); } else if ( physicalPage == 9) { l9.setText( null ); } else if ( physicalPage == 10) { l10.setText( null ); } else if ( physicalPage == 11) { l11.setText( null ); } else if ( physicalPage == 12) { l12.setText( null ); } else if ( physicalPage == 13) { l13.setText( null ); } else if ( physicalPage == 14) { l14.setText( null ); } else if ( physicalPage == 15) { l15.setText( null ); } else if ( physicalPage == 16) { l16.setText( null ); } else if ( physicalPage == 17) { l17.setText( null ); } else if ( physicalPage == 18) { l18.setText( null ); } else if ( physicalPage == 19) { l19.setText( null ); } else if ( physicalPage == 20) { l20.setText( null ); } else if ( physicalPage == 21) { l21.setText( null ); } else if ( physicalPage == 22) { l22.setText( null ); } else if ( physicalPage == 23) { l23.setText( null ); } else if ( physicalPage == 24) { l24.setText( null ); } else if ( physicalPage == 25) { l25.setText( null ); } else if ( physicalPage == 26) { l26.setText( null ); } 
else if ( physicalPage == 27) { l27.setText( null ); } else if ( physicalPage == 28) { l28.setText( null ); } else if ( physicalPage == 29) { l29.setText( null ); } else if ( physicalPage == 30) { l30.setText( null ); } else if ( physicalPage == 31) { l31.setText( null ); } else if ( physicalPage == 32) { l32.setText( null ); } else if ( physicalPage == 33) { l33.setText( null ); } else if ( physicalPage == 34) { l34.setText( null ); } else if ( physicalPage == 35) { l35.setText( null ); } else if ( physicalPage == 36) { l36.setText( null ); } else if ( physicalPage == 37) { l37.setText( null ); } else if ( physicalPage == 38) { l38.setText( null ); } else if ( physicalPage == 39) { l39.setText( null ); } else if ( physicalPage == 40) { l40.setText( null ); } else if ( physicalPage == 41) { l41.setText( null ); } else if ( physicalPage == 42) { l42.setText( null ); } else if ( physicalPage == 43) { l43.setText( null ); } else if ( physicalPage == 44) { l44.setText( null ); } else if ( physicalPage == 45) { l45.setText( null ); } else if ( physicalPage == 46) { l46.setText( null ); } else if ( physicalPage == 47) { l47.setText( null ); } else if ( physicalPage == 48) { l48.setText( null ); } else if ( physicalPage == 49) { l49.setText( null ); } else if ( physicalPage == 50) { l50.setText( null ); } else if ( physicalPage == 51) { l51.setText( null ); } else if ( physicalPage == 52) { l52.setText( null ); } else if ( physicalPage == 53) { l53.setText( null ); } else if ( physicalPage == 54) { l54.setText( null ); } else if ( physicalPage == 55) { l55.setText( null ); } else if ( physicalPage == 56) { l56.setText( null ); } else if ( physicalPage == 57) { l57.setText( null ); } else if ( physicalPage == 58) { l58.setText( null ); } else if ( physicalPage == 59) { l59.setText( null ); } else if ( physicalPage == 60) { l60.setText( null ); } else if ( physicalPage == 61) { l61.setText( null ); } else if ( physicalPage == 62) { l62.setText( null ); } else if ( 
physicalPage == 63) { l63.setText( null ); } else { return; } } public boolean action( Event e, Object arg ) { if ( e.target == runButton ) { setStatus( "RUN" ); runButton.disable(); stepButton.disable(); resetButton.disable(); kernel.run(); setStatus( "STOP" ); resetButton.enable(); return true; } else if ( e.target == stepButton ) { setStatus( "STEP" ); kernel.step(); if (kernel.runcycles == kernel.runs) { stepButton.disable(); runButton.disable(); } setStatus("STOP"); return true; } else if ( e.target == resetButton ) { kernel.reset(); runButton.enable(); stepButton.enable(); return true; } else if ( e.target == exitButton ) { System.exit(0); return true; } else if ( e.target == b0 ) { kernel.getPage(0); return true; } else if ( e.target == b1 ) { kernel.getPage(1); return true; } else if ( e.target == b2 ) { kernel.getPage(2); return true; } else if ( e.target == b3 ) { kernel.getPage(3); return true; } else if ( e.target == b4 ) { kernel.getPage(4); return true; } else if ( e.target == b5 ) { kernel.getPage(5); return true; } else if ( e.target == b6 ) { kernel.getPage(6); return true; } else if ( e.target == b7 ) { kernel.getPage(7); return true; } else if ( e.target == b8 ) { kernel.getPage(8); return true; } else if ( e.target == b9 ) { kernel.getPage(9); return true; } else if ( e.target == b10 ) { kernel.getPage(10); return true; } else if ( e.target == b11 ) { kernel.getPage(11); return true; } else if ( e.target == b12 ) { kernel.getPage(12); return true; } else if ( e.target == b13 ) { kernel.getPage(13); return true; } else if ( e.target == b14 ) { kernel.getPage(14); return true; } else if ( e.target == b15 ) { kernel.getPage(15); return true; } else if ( e.target == b16 ) { kernel.getPage(16); return true; } else if ( e.target == b17 ) { kernel.getPage(17); return true; } else if ( e.target == b18 ) { kernel.getPage(18); return true; } else if ( e.target == b19 ) { kernel.getPage(19); return true; } else if ( e.target == b20 ) { kernel.getPage(20); 
return true; } else if ( e.target == b21 ) { kernel.getPage(21); return true; } else if ( e.target == b22 ) { kernel.getPage(22); return true; } else if ( e.target == b23 ) { kernel.getPage(23); return true; } else if ( e.target == b24 ) { kernel.getPage(24); return true; } else if ( e.target == b25 ) { kernel.getPage(25); return true; } else if ( e.target == b26 ) { kernel.getPage(26); return true; } else if ( e.target == b27 ) { kernel.getPage(27); return true; } else if ( e.target == b28 ) { kernel.getPage(28); return true; } else if ( e.target == b29 ) { kernel.getPage(29); return true; } else if ( e.target == b30 ) { kernel.getPage(30); return true; } else if ( e.target == b31 ) { kernel.getPage(31); return true; } else if ( e.target == b32 ) { kernel.getPage(32); return true; } else if ( e.target == b33 ) { kernel.getPage(33); return true; } else if ( e.target == b34 ) { kernel.getPage(34); return true; } else if ( e.target == b35 ) { kernel.getPage(35); return true; } else if ( e.target == b36 ) { kernel.getPage(36); return true; } else if ( e.target == b37 ) { kernel.getPage(37); return true; } else if ( e.target == b38 ) { kernel.getPage(38); return true; } else if ( e.target == b39 ) { kernel.getPage(39); return true; } else if ( e.target == b40 ) { kernel.getPage(40); return true; } else if ( e.target == b41 ) { kernel.getPage(41); return true; } else if ( e.target == b42 ) { kernel.getPage(42); return true; } else if ( e.target == b43 ) { kernel.getPage(43); return true; } else if ( e.target == b44 ) { kernel.getPage(44); return true; } else if ( e.target == b45 ) { kernel.getPage(45); return true; } else if ( e.target == b46 ) { kernel.getPage(46); return true; } else if ( e.target == b47 ) { kernel.getPage(47); return true; } else if ( e.target == b48 ) { kernel.getPage(48); return true; } else if ( e.target == b49 ) { kernel.getPage(49); return true; } else if ( e.target == b50 ) { kernel.getPage(50); return true; } else if ( e.target == b51 ) { 
kernel.getPage(51); return true; } else if ( e.target == b52 ) { kernel.getPage(52); return true; } else if ( e.target == b53 ) { kernel.getPage(53); return true; } else if ( e.target == b54 ) { kernel.getPage(54); return true; } else if ( e.target == b55 ) { kernel.getPage(55); return true; } else if ( e.target == b56 ) { kernel.getPage(56); return true; } else if ( e.target == b57 ) { kernel.getPage(57); return true; } else if ( e.target == b58 ) { kernel.getPage(58); return true; } else if ( e.target == b59 ) { kernel.getPage(59); return true; } else if ( e.target == b60 ) { kernel.getPage(60); return true; } else if ( e.target == b61 ) { kernel.getPage(61); return true; } else if ( e.target == b62 ) { kernel.getPage(62); return true; } else if ( e.target == b63 ) { kernel.getPage(63); return true; } else { return false; } } } memory/Instruction.java memory/Instruction.java public class Instruction { public String inst ; public long addr ; public Instruction ( String inst , long addr ) { this . inst = inst ; this . addr = addr ; } } memory/javadoc/AllNames.html All Packages Class Hierarchy A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Index of all Fields and Methods A action(Event, Object). Method in class ControlPanel addPhysicalPage(int, int). Method in class ControlPanel addr. Variable in class Instruction addressradix. Static variable in class Kernel B block. Variable in class Kernel C Common(). Constructor for class Common ControlPanel(). Constructor for class ControlPanel ControlPanel(String). Constructor for class ControlPanel G getPage(int). Method in class Kernel H high. Variable in class Page I id. Variable in class Page init(Kernel, String, String). Method in class ControlPanel init(String, String). Method in class Kernel inMemTime. Variable in class Page inst. Variable in class Instruction Instruction(String, long). Constructor for class Instruction K Kernel(). Constructor for class Kernel L lastTouchTime. Variable in class Page low. 
Variable in class Page M M. Variable in class Page main(String[]). Static method in class MemoryManagement MemoryManagement(). Constructor for class MemoryManagement P Page(int, int, byte, byte, int, int, long, long). Constructor for class Page PageFault(). Constructor for class PageFault pageNum(long, int, long). Static method in class Virtual2Physical paintPage(Page). Method in class ControlPanel physical. Variable in class Page R R. Variable in class Page randomLong(long). Static method in class Common removePhysicalPage(int). Method in class ControlPanel replacePage(Vector, int, int, ControlPanel). Static method in class PageFault The page replacement algorithm for the memory management simulator. reset(). Method in class Kernel run(). Method in class Kernel runcycles. Variable in class Kernel runs. Variable in class Kernel S s2b(String). Static method in class Common s2i(String). Static method in class Common s2l(String). Static method in class Common setControlPanel(ControlPanel). Method in class Kernel setStatus(String). Method in class ControlPanel step(). Method in class Kernel V Virtual2Physical(). 
Constructor for class Virtual2Physical memory/javadoc/Common.html Class Common java.lang.Object | +----Common public class Common extends Object Common() randomLong(long) s2b(String) s2i(String) s2l(String) Common public Common() s2l public static long s2l(String s) s2i public static int s2i(String s) s2b public static byte s2b(String s) randomLong public static long randomLong(long MAX) memory/javadoc/ControlPanel.html Class ControlPanel java.lang.Object | +----java.awt.Component | +----java.awt.Container | +----java.awt.Window | +----java.awt.Frame | +----ControlPanel public class ControlPanel extends Frame ControlPanel() ControlPanel(String) action(Event, Object) addPhysicalPage(int, int) init(Kernel, String, String) paintPage(Page) removePhysicalPage(int) setStatus(String) ControlPanel public ControlPanel() ControlPanel public ControlPanel(String title) init public void init(Kernel useKernel, String commands, String config) paintPage public void paintPage(Page page) setStatus public void setStatus(String status) addPhysicalPage public void addPhysicalPage(int pageNum, int physicalPage) removePhysicalPage public void removePhysicalPage(int physicalPage) action public boolean action(Event e, Object arg) Overrides: action in class Component memory/javadoc/images/BaseObject.gif memory/javadoc/images/blue-ball-small.gif memory/javadoc/images/blue-ball.gif memory/javadoc/images/Category.gif memory/javadoc/images/class-index.gif memory/javadoc/images/Class.gif memory/javadoc/images/Collection.gif memory/javadoc/images/constructor-index.gif memory/javadoc/images/constructors.gif memory/javadoc/images/cyan-ball-small.gif memory/javadoc/images/cyan-ball.gif memory/javadoc/images/DataObject.gif memory/javadoc/images/error-index.gif memory/javadoc/images/exception-index.gif memory/javadoc/images/green-ball-small.gif memory/javadoc/images/green-ball.gif memory/javadoc/images/Group.gif memory/javadoc/images/interface-index.gif memory/javadoc/images/Interface.gif 
memory/javadoc/images/Job.gif memory/javadoc/images/JobOutput.gif memory/javadoc/images/JobParameter.gif memory/javadoc/images/magenta-ball-small.gif memory/javadoc/images/magenta-ball.gif memory/javadoc/images/method-index.gif memory/javadoc/images/methods.gif memory/javadoc/images/ObjectID.gif memory/javadoc/images/ObjectType.gif memory/javadoc/images/OpenBookIcon.gif memory/javadoc/images/package-index.gif memory/javadoc/images/Permissions.gif memory/javadoc/images/Query.gif memory/javadoc/images/QueryVector.gif memory/javadoc/images/red-ball-small.gif memory/javadoc/images/red-ball.gif memory/javadoc/images/ReportMartEntity.gif memory/javadoc/images/ReportMartException.gif memory/javadoc/images/Repository.gif memory/javadoc/images/Session.gif memory/javadoc/images/SessionFactory.gif memory/javadoc/images/SPFSet.gif memory/javadoc/images/SQRJob.gif memory/javadoc/images/SQRJobOutput.gif memory/javadoc/images/UnimplementedMethodException.gif memory/javadoc/images/UnknownReportMartException.gif memory/javadoc/images/User.gif memory/javadoc/images/UserValidationException.gif memory/javadoc/images/variable-index.gif memory/javadoc/images/variables.gif memory/javadoc/images/yellow-ball-small.gif memory/javadoc/images/yellow-ball.gif memory/javadoc/Instruction.html Class Instruction java.lang.Object | +----Instruction public class Instruction extends Object addr inst Instruction(String, long) inst public String inst addr public long addr Instruction public Instruction(String inst, long addr) memory/javadoc/Kernel.html Class Kernel java.lang.Object | +----java.lang.Thread | +----Kernel public class Kernel extends Thread addressradix block runcycles runs Kernel() getPage(int) init(String, String) reset() run() setControlPanel(ControlPanel) step() runs public int runs runcycles public int runcycles block public long block addressradix public static byte addressradix Kernel public Kernel() init public void init(String commands, String config) setControlPanel public void 
setControlPanel(ControlPanel newControlPanel) getPage public void getPage(int pageNum) run public void run() Overrides: run in class Thread step public void step() reset public void reset() memory/javadoc/MemoryManagement.html Class MemoryManagement java.lang.Object | +----MemoryManagement public class MemoryManagement extends Object MemoryManagement() main(String[]) MemoryManagement public MemoryManagement() main public static void main(String args[]) memory/javadoc/packages.html API User's Guide Class Hierarchy Index memory/javadoc/Page.html Class Page java.lang.Object | +----Page public class Page extends Object high id inMemTime lastTouchTime low M physical R Page(int, int, byte, byte, int, int, long, long) id public int id physical public int physical R public byte R M public byte M inMemTime public int inMemTime lastTouchTime public int lastTouchTime high public long high low public long low Page public Page(int id, int physical, byte R, byte M, int inMemTime, int lastTouchTime, long high, long low) memory/javadoc/PageFault.html Class PageFault java.lang.Object | +----PageFault public class PageFault extends Object PageFault() replacePage(Vector, int, int, ControlPanel) The page replacement algorithm for the memory management simulator. PageFault public PageFault() replacePage public static void replacePage(Vector mem, int virtPageNum, int replacePageNum, ControlPanel controlPanel)

The page replacement algorithm for the memory management simulator. This method is called whenever a page needs to be replaced. The page replacement algorithm included with the simulator is FIFO (first-in, first-out). A while or for loop should be used to search the current memory contents for a candidate replacement page. In the case of FIFO, the while loop finds the proper page while making sure that virtPageNum is not exceeded.

Page page = ( Page ) mem.elementAt( oldestPage )

This line reads the Page at index oldestPage (a specified integer) from the mem vector into the page object. Next, retrieve the target page, replacePageNum, and set its physical memory address to that of the page being removed.

controlPanel.removePhysicalPage( oldestPage )

Once a page is removed from memory, the removal must also be reflected graphically. This line does so by removing the physical page at the oldestPage value. The page being added to memory must likewise be displayed through the addPhysicalPage function call. Remember also to reset the values of the page that has just been removed from memory.

Parameters:
mem - the vector that contains the pages in the memory being simulated. mem should be searched to find the proper page to remove, and modified to reflect any changes.
virtPageNum - the number of virtual pages in the simulator (set in Kernel.java).
replacePageNum - the requested page that caused the page fault.
controlPanel - the graphical element of the simulator, which allows one to modify the current display.
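The FIFO procedure described above can be sketched roughly as follows. This is a minimal illustration, not the simulator's actual PageFault.replacePage: SketchPage and FifoSketch are hypothetical stand-ins, a List replaces the Vector, and the ControlPanel calls are reduced to comments so the example runs without the GUI. It also assumes that inMemTime counts how long a page has been resident, so the largest value marks the oldest page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the simulator's Page class (only the fields used here). */
class SketchPage {
    int id;
    int physical = -1;          // -1 means "not mapped to a physical page"
    byte R, M;
    int inMemTime, lastTouchTime;

    SketchPage(int id) { this.id = id; }
}

class FifoSketch {
    /**
     * Finds the resident page with the largest inMemTime (assumed to be the
     * oldest), hands its physical page to the faulting page, and resets the
     * evicted page. Returns the evicted page's index, or -1 if none is resident.
     */
    static int replacePage(List<SketchPage> mem, int replacePageNum) {
        int oldestPage = -1;
        int oldestTime = -1;
        for (int i = 0; i < mem.size(); i++) {
            SketchPage p = mem.get(i);
            if (p.physical != -1 && p.inMemTime > oldestTime) {
                oldestTime = p.inMemTime;
                oldestPage = i;
            }
        }
        if (oldestPage == -1) {
            return -1;          // nothing to evict
        }

        SketchPage victim = mem.get(oldestPage);
        SketchPage incoming = mem.get(replacePageNum);

        // The simulator would call controlPanel.removePhysicalPage( oldestPage ) here.
        incoming.physical = victim.physical;
        // ...and controlPanel.addPhysicalPage(...) to display the new mapping.

        // Reset the values of the page that has just been removed from memory.
        victim.physical = -1;
        victim.R = 0;
        victim.M = 0;
        victim.inMemTime = 0;
        victim.lastTouchTime = 0;
        return oldestPage;
    }

    public static void main(String[] args) {
        List<SketchPage> mem = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            mem.add(new SketchPage(i));
        }
        mem.get(0).physical = 0; mem.get(0).inMemTime = 30;
        mem.get(1).physical = 1; mem.get(1).inMemTime = 50;

        int victim = replacePage(mem, 3);
        // prints "evicted page 1, page 3 now in physical page 1"
        System.out.println("evicted page " + victim
                + ", page 3 now in physical page " + mem.get(3).physical);
    }
}
```

Swapping in a different policy (for example, least recently used) would only change the selection loop, for instance by comparing lastTouchTime instead of inMemTime; the hand-over and reset steps stay the same.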
memory/javadoc/tree.html

All Packages  Index

Class Hierarchy

class java.lang.Object
  class Common
  class java.awt.Component (implements java.awt.image.ImageObserver, java.awt.MenuContainer, java.io.Serializable)
    class java.awt.Container
      class java.awt.Window
        class java.awt.Frame (implements java.awt.MenuContainer)
          class ControlPanel
  class Instruction
  class MemoryManagement
  class Page
  class PageFault
  class java.lang.Thread (implements java.lang.Runnable)
    class Kernel
  class Virtual2Physical

memory/javadoc/Virtual2Physical.html

Class Virtual2Physical

java.lang.Object
   |
   +----Virtual2Physical

public class Virtual2Physical extends Object

Virtual2Physical()
pageNum(long, int, long)

public Virtual2Physical()
public static int pageNum(long memaddr, int numpages, long block)

memory/Kernel.java

import java.lang.Thread;
import java.io.*;
import java.util.*;
//import Page;

public class Kernel extends Thread {
  // The number of virtual pages must be fixed at 63 due to
  // dependencies in the GUI
  private static int virtPageNum = 63;

  private String output = null;
  private static final String lineSeparator = System.getProperty("line.separator");
  private String command_file;
  private String config_file;
  private ControlPanel controlPanel;
  private Vector memVector = new Vector();
  private Vector instructVector = new Vector();
  private String status;
  private boolean doStdoutLog = false;
  private boolean doFileLog = false;
  public int runs;
  public int runcycles;
  public long block = (int) Math.pow(2, 12);
  public static byte addressradix = 10;

  public void init(String commands, String config) {
    File f = new File(commands);
    command_file = commands;
    config_file = config;
    String line;
    String tmp = null;
    String command = "";
    byte R = 0;
    byte M = 0;
    int i = 0;
    int j = 0;
    int id = 0;
    int physical = 0;
    int physical_count = 0;
    int inMemTime = 0;
    int lastTouchTime = 0;
    int map_count = 0;
    double power = 14;
    long high = 0;
    long low = 0;
    long addr = 0;
    long address_limit = (block * virtPageNum + 1) - 1;

    if (config != null) {
      f = new File(config);
      // First pass over the configuration file: read numpages only.
      try {
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        while ((line = in.readLine()) != null) {
          if (line.startsWith("numpages")) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
              tmp = st.nextToken();
              virtPageNum = Common.s2i(st.nextToken()) - 1;
              if (virtPageNum < 2 || virtPageNum > 63) {
                System.out.println("MemoryManagement: numpages out of bounds.");
                System.exit(-1);
              }
              address_limit = (block * virtPageNum + 1) - 1;
            }
          }
        }
        in.close();
      } catch (IOException e) { /* Handle exceptions */ }
      // Create every virtual page as unmapped.
      for (i = 0; i <= virtPageNum; i++) {
        high = (block * (i + 1)) - 1;
        low = block * i;
        memVector.addElement(new Page(i, -1, R, M, 0, 0, high, low));
      }
      // Second pass: read the remaining options.
      try {
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        while ((line = in.readLine()) != null) {
          if (line.startsWith("memset")) {
            StringTokenizer st = new StringTokenizer(line);
            st.nextToken();
            while (st.hasMoreTokens()) {
              id = Common.s2i(st.nextToken());
              tmp = st.nextToken();
              if (tmp.startsWith("x")) {
                physical = -1;
              } else {
                physical = Common.s2i(tmp);
              }
              if ((0 > id || id > virtPageNum) || (-1 > physical || physical > ((virtPageNum - 1) / 2))) {
                System.out.println("MemoryManagement: Invalid page value in " + config);
                System.exit(-1);
              }
              R = Common.s2b(st.nextToken());
              if (R < 0 || R > 1) {
                System.out.println("MemoryManagement: Invalid R value in " + config);
                System.exit(-1);
              }
              M = Common.s2b(st.nextToken());
              if (M < 0 || M > 1) {
                System.out.println("MemoryManagement: Invalid M value in " + config);
                System.exit(-1);
              }
              inMemTime = Common.s2i(st.nextToken());
              if (inMemTime < 0) {
                System.out.println("MemoryManagement: Invalid inMemTime in " + config);
                System.exit(-1);
              }
              lastTouchTime = Common.s2i(st.nextToken());
              if (lastTouchTime < 0) {
                System.out.println("MemoryManagement: Invalid lastTouchTime in " + config);
                System.exit(-1);
              }
              Page page = (Page) memVector.elementAt(id);
              page.physical = physical;
              page.R = R;
              page.M = M;
              page.inMemTime = inMemTime;
              page.lastTouchTime = lastTouchTime;
            }
          }
          if (line.startsWith("enable_logging")) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
              if (st.nextToken().startsWith("true")) {
                doStdoutLog = true;
              }
            }
          }
          if (line.startsWith("log_file")) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
              tmp = st.nextToken();
            }
            if (tmp.startsWith("log_file")) {
              doFileLog = false;
              output = "tracefile";
            } else {
              doFileLog = true;
              doStdoutLog = false;
              output = tmp;
            }
          }
          if (line.startsWith("pagesize")) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
              tmp = st.nextToken();
              tmp = st.nextToken();
              if (tmp.startsWith("power")) {
                power = (double) Integer.parseInt(st.nextToken());
                block = (int) Math.pow(2, power);
              } else {
                block = Long.parseLong(tmp, 10);
              }
              address_limit = (block * virtPageNum + 1) - 1;
            }
            if (block < 64 || block > Math.pow(2, 26)) {
              System.out.println("MemoryManagement: pagesize is out of bounds");
              System.exit(-1);
            }
            for (i = 0; i <= virtPageNum; i++) {
              Page page = (Page) memVector.elementAt(i);
              page.high = (block * (i + 1)) - 1;
              page.low = block * i;
            }
          }
          if (line.startsWith("addressradix")) {
            StringTokenizer st = new StringTokenizer(line);
            while (st.hasMoreTokens()) {
              tmp = st.nextToken();
              tmp = st.nextToken();
              addressradix = Byte.parseByte(tmp);
              if (addressradix < 0 || addressradix > 20) {
                System.out.println("MemoryManagement: addressradix out of bounds.");
                System.exit(-1);
              }
            }
          }
        }
        in.close();
      } catch (IOException e) { /* Handle exceptions */ }
    }
    // Read the command file into instructVector.
    f = new File(commands);
    try {
      DataInputStream in = new DataInputStream(new FileInputStream(f));
      while ((line = in.readLine()) != null) {
        if (line.startsWith("READ") || line.startsWith("WRITE")) {
          if (line.startsWith("READ")) {
            command = "READ";
          }
          if (line.startsWith("WRITE")) {
            command = "WRITE";
          }
          StringTokenizer st = new StringTokenizer(line);
          tmp = st.nextToken();   // the READ or WRITE keyword
          tmp = st.nextToken();   // "random", a radix keyword, or a decimal address
          if (tmp.startsWith("random")) {
            instructVector.addElement(new Instruction(command, Common.randomLong(address_limit)));
          } else {
            if (tmp.startsWith("bin")) {
              addr = Long.parseLong(st.nextToken(), 2);
            } else if (tmp.startsWith("oct")) {
              addr = Long.parseLong(st.nextToken(), 8);
            } else if (tmp.startsWith("hex")) {
              addr = Long.parseLong(st.nextToken(), 16);
            } else {
              addr = Long.parseLong(tmp);
            }
            if (0 > addr || addr > address_limit) {
              System.out.println("MemoryManagement: " + addr + ", Address out of range in " + commands);
              System.exit(-1);
            }
            instructVector.addElement(new Instruction(command, addr));
          }
        }
      }
      in.close();
    } catch (IOException e) { /* Handle exceptions */ }
    runcycles = instructVector.size();
    if (runcycles < 1) {
      System.out.println("MemoryManagement: no instructions present for execution.");
      System.exit(-1);
    }
    if (doFileLog) {
      File trace = new File(output);
      trace.delete();
    }
    runs = 0;
    // Count mapped pages and reject duplicate physical page assignments.
    for (i = 0; i < virtPageNum; i++) {
      Page page = (Page) memVector.elementAt(i);
      if (page.physical != -1) {
        map_count++;
      }
      for (j = 0; j < virtPageNum; j++) {
        Page tmp_page = (Page) memVector.elementAt(j);
        if (tmp_page.physical == page.physical && page.physical >= 0) {
          physical_count++;
        }
      }
      if (physical_count > 1) {
        System.out.println("MemoryManagement: Duplicate physical page's in " + config);
        System.exit(-1);
      }
      physical_count = 0;
    }
    // Map further pages until half of the virtual pages have a physical page.
    if (map_count < (virtPageNum + 1) / 2) {
      for (i = 0; i < virtPageNum; i++) {
        Page page = (Page) memVector.elementAt(i);
        if (page.physical == -1 && map_count < (virtPageNum + 1) / 2) {
          page.physical = i;
          map_count++;
        }
      }
    }
    for (i = 0; i < virtPageNum; i++) {
      Page page = (Page) memVector.elementAt(i);
      if (page.physical == -1) {
        controlPanel.removePhysicalPage(i);
      } else {
        controlPanel.addPhysicalPage(i, page.physical);
      }
    }
    for (i = 0; i < instructVector.size(); i++) {
      high = block * virtPageNum;
      Instruction instruct = (Instruction) instructVector.elementAt(i);
      if (instruct.addr < 0 || instruct.addr > high) {
        System.out.println("MemoryManagement: Instruction (" + instruct.inst + " " + instruct.addr + ") out of bounds.");
        System.exit(-1);
      }
    }
  }
  public void setControlPanel(ControlPanel newControlPanel) {
    controlPanel = newControlPanel;
  }

  public void getPage(int pageNum) {
    Page page = (Page) memVector.elementAt(pageNum);
    controlPanel.paintPage(page);
  }

  // Appends message to the trace file by reading the existing
  // contents and rewriting the whole file.
  private void printLogFile(String message) {
    String line;
    String temp = "";

    File trace = new File(output);
    if (trace.exists()) {
      try {
        DataInputStream in = new DataInputStream(new FileInputStream(output));
        while ((line = in.readLine()) != null) {
          temp = temp + line + lineSeparator;
        }
        in.close();
      } catch (IOException e) {
        /* Do nothing */
      }
    }
    try {
      PrintStream out = new PrintStream(new FileOutputStream(output));
      out.print(temp);
      out.print(message);
      out.close();
    } catch (IOException e) {
      /* Do nothing */
    }
  }

  public void run() {
    step();
    while (runs != runcycles) {
      try {
        Thread.sleep(2000);
      } catch (InterruptedException e) {
        /* Do nothing */
      }
      step();
    }
  }

  public void step() {
    int i = 0;

    Instruction instruct = (Instruction) instructVector.elementAt(runs);
    controlPanel.instructionValueLabel.setText(instruct.inst);
    controlPanel.addressValueLabel.setText(Long.toString(instruct.addr, addressradix));
    getPage(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
    // Compare label text with equals(), not ==, so the check does not
    // depend on string interning.
    if ("YES".equals(controlPanel.pageFaultValueLabel.getText())) {
      controlPanel.pageFaultValueLabel.setText("NO");
    }
    if (instruct.inst.startsWith("READ")) {
      Page page = (Page) memVector.elementAt(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
      if (page.physical == -1) {
        if (doFileLog) {
          printLogFile("READ " + Long.toString(instruct.addr, addressradix) + " ... page fault");
        }
        if (doStdoutLog) {
          System.out.println("READ " + Long.toString(instruct.addr, addressradix) + " ... page fault");
        }
        PageFault.replacePage(memVector, virtPageNum, Virtual2Physical.pageNum(instruct.addr, virtPageNum, block), controlPanel);
        controlPanel.pageFaultValueLabel.setText("YES");
      } else {
        page.R = 1;
        page.lastTouchTime = 0;
        if (doFileLog) {
          printLogFile("READ " + Long.toString(instruct.addr, addressradix) + " ... okay");
        }
        if (doStdoutLog) {
          System.out.println("READ " + Long.toString(instruct.addr, addressradix) + " ... okay");
        }
      }
    }
    if (instruct.inst.startsWith("WRITE")) {
      Page page = (Page) memVector.elementAt(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
      if (page.physical == -1) {
        if (doFileLog) {
          printLogFile("WRITE " + Long.toString(instruct.addr, addressradix) + " ... page fault");
        }
        if (doStdoutLog) {
          System.out.println("WRITE " + Long.toString(instruct.addr, addressradix) + " ... page fault");
        }
        PageFault.replacePage(memVector, virtPageNum, Virtual2Physical.pageNum(instruct.addr, virtPageNum, block), controlPanel);
        controlPanel.pageFaultValueLabel.setText("YES");
      } else {
        page.M = 1;
        page.lastTouchTime = 0;
        if (doFileLog) {
          printLogFile("WRITE " + Long.toString(instruct.addr, addressradix) + " ... okay");
        }
        if (doStdoutLog) {
          System.out.println("WRITE " + Long.toString(instruct.addr, addressradix) + " ... okay");
        }
      }
    }
    // Advance simulated time by 10 ns for every mapped page.
    for (i = 0; i < virtPageNum; i++) {
      Page page = (Page) memVector.elementAt(i);
      if (page.R == 1 && page.lastTouchTime == 10) {
        page.R = 0;
      }
      if (page.physical != -1) {
        page.inMemTime = page.inMemTime + 10;
        page.lastTouchTime = page.lastTouchTime + 10;
      }
    }
    runs++;
    controlPanel.timeValueLabel.setText(Integer.toString(runs * 10) + " (ns)");
  }

  public void reset() {
    memVector.removeAllElements();
    instructVector.removeAllElements();
    controlPanel.statusValueLabel.setText("STOP");
    controlPanel.timeValueLabel.setText("0");
    controlPanel.instructionValueLabel.setText("NONE");
    controlPanel.addressValueLabel.setText("NULL");
    controlPanel.pageFaultValueLabel.setText("NO");
    controlPanel.virtualPageValueLabel.setText("x");
    controlPanel.physicalPageValueLabel.setText("0");
    controlPanel.RValueLabel.setText("0");
    controlPanel.MValueLabel.setText("0");
    controlPanel.inMemTimeValueLabel.setText("0");
    controlPanel.lastTouchTimeValueLabel.setText("0");
    controlPanel.lowValueLabel.setText("0");
    controlPanel.highValueLabel.setText("0");
    init(command_file, config_file);
  }
}

memory/memory.conf

// memset virt page # physical page # R (read from) M (modified) inMemTime (ns) lastTouchTime (ns)
memset 0 0 0 0 0 0
memset 1 1 0 0 0 0
memset 2 2 0 0 0 0
memset 3 3 0 0 0 0
memset 4 4 0 0 0 0
memset 5 5 0 0 0 0
memset 6 6 0 0 0 0
memset 7 7 0 0 0 0
memset 8 8 0 0 0 0
memset 9 9 0 0 0 0
memset 10 10 0 0 0 0
memset 11 11 0 0 0 0
memset 12 12 0 0 0 0
memset 13 13 0 0 0 0
memset 14 14 0 0 0 0
memset 15 15 0 0 0 0
memset 16 16 0 0 0 0
memset 17 17 0 0 0 0
memset 18 18 0 0 0 0
memset 19 19 0 0 0 0
memset 20 20 0 0 0 0
memset 21 21 0 0 0 0
memset 22 22 0 0 0 0
memset 23 23 0 0 0 0
memset 24 24 0 0 0 0
memset 25 25 0 0 0 0
memset 26 26 0 0 0 0
memset 27 27 0 0 0 0
memset 28 28 0 0 0 0
memset 29 29 0 0 0 0
memset 30 30 0 0 0 0
memset 31 31 0 0 0 0
// enable_logging 'true' or 'false'
// When true specify a log_file or leave blank for stdout
enable_logging true
// log_file
// Where is the name of the file you want output
// to be print to.
log_file tracefile
// page size, defaults to 2^14 and cannot be greater than 2^26
// pagesize or <'power' num (base 2)>
pagesize 16384
// addressradix sets the radix in which numerical values are displayed
// 2 is the default value
// addressradix
addressradix 16
// numpages sets the number of pages (physical and virtual)
// 64 is the default value
// numpages must be at least 2 and no more than 64
// numpages
numpages 64

memory/MemoryManagement.java

// The main MemoryManagement program

import java.applet.*;
import java.awt.*;
import java.io.*;
import java.util.*;
//import ControlPanel;
//import PageFault;
//import Virtual2Physical;
//import Common;
//import Page;

public class MemoryManagement {
  public static void main(String[] args) {
    ControlPanel controlPanel;
    Kernel kernel;

    // Expect a command file and, optionally, a configuration file.
    if (args.length < 1 || args.length > 2) {
      System.out.println("Usage: 'java MemoryManagement '");
      System.exit(-1);
    }

    File f = new File(args[0]);

    if (!(f.exists())) {
      System.out.println("MemoryM: error, file '" + f.getName() + "' does not exist.");
      System.exit(-1);
    }
    if (!(f.canRead())) {
      System.out.println("MemoryM: error, read of " + f.getName() + " failed.");
      System.exit(-1);
    }

    if (args.length == 2) {
      f = new File(args[1]);

      if (!(f.exists())) {
        System.out.println("MemoryM: error, file '" + f.getName() + "' does not exist.");
        System.exit(-1);
      }
      if (!(f.canRead())) {
        System.out.println("MemoryM: error, read of " + f.getName() + " failed.");
        System.exit(-1);
      }
    }

    kernel = new Kernel();
    controlPanel = new ControlPanel("Memory Management");
    if (args.length == 1) {
      controlPanel.init(kernel, args[0], null);
    } else {
      controlPanel.init(kernel, args[0], args[1]);
    }
  }
}


memory/Page.java

public class Page {
  public int id;
  public int physical;
  public byte R;
  public byte M;
  public int inMemTime;
  public int lastTouchTime;
  public long high;
  public long low;

  public Page(int id, int physical, byte R, byte M, int inMemTime, int lastTouchTime, long high, long low) {
    this.id = id;
    this.physical = physical;
    this.R = R;
    this.M = M;
    this.inMemTime = inMemTime;
    this.lastTouchTime = lastTouchTime;
    this.high = high;
    this.low = low;
  }
}

memory/PageFault.java

/* It is in this file, specifically the replacePage function that will
   be called by MemoryManagement when there is a page fault. The
   users of this program should rewrite PageFault to implement the
   page replacement algorithm.
*/

// This PageFault file is an example of the FIFO Page Replacement Algorithm.

import java.util.*;
//import Page;

public class PageFault {

  /**
   * The page replacement algorithm for the memory management simulator.
   * This method gets called whenever a page needs to be replaced.
   *
   * The page replacement algorithm included with the simulator is
   * FIFO (first-in first-out). A while or for loop should be used
   * to search through the current memory contents for a candidate
   * replacement page. In the case of FIFO the while loop is used
   * to find the proper page while making sure that virtPageNum is
   * not exceeded.
   *
   *   Page page = ( Page ) mem.elementAt( oldestPage )
   *
   * This line brings the contents of the Page at oldestPage (a
   * specified integer) from the mem vector into the page object.
   * Next recall the contents of the target page, replacePageNum.
   * Set the physical memory address of the page to be added equal
   * to the page to be removed.
   *
   *   controlPanel.removePhysicalPage( oldestPage )
   *
   * Once a page is removed from memory it must also be reflected
   * graphically. This line does so by removing the physical page
   * at the oldestPage value. The page which will be added into
   * memory must also be displayed through the addPhysicalPage
   * function call. One must also remember to reset the values of
   * the page which has just been removed from memory.
   *
   * @param mem is the vector which contains the contents of the pages
   *   in memory being simulated. mem should be searched to find the
   *   proper page to remove, and modified to reflect any changes.
   * @param virtPageNum is the number of virtual pages in the
   *   simulator (set in Kernel.java).
   * @param replacePageNum is the requested page which caused the
   *   page fault.
   * @param controlPanel represents the graphical element of the
   *   simulator, and allows one to modify the current display.
   */
  public static void replacePage(Vector mem, int virtPageNum, int replacePageNum, ControlPanel controlPanel) {
    int count = 0;
    int oldestPage = -1;
    int oldestTime = 0;
    int firstPage = -1;
    int map_count = 0;
    boolean mapped = false;

    // Scan every page, remembering the mapped page that has been
    // in memory the longest.
    while (!(mapped) || count != virtPageNum) {
      Page page = (Page) mem.elementAt(count);
      if (page.physical != -1) {
        if (firstPage == -1) {
          firstPage = count;
        }
        if (page.inMemTime > oldestTime) {
          oldestTime = page.inMemTime;
          oldestPage = count;
          mapped = true;
        }
      }
      count++;
      if (count == virtPageNum) {
        mapped = true;
      }
    }
    // Fall back to the first mapped page if none had a positive inMemTime.
    if (oldestPage == -1) {
      oldestPage = firstPage;
    }
    Page page = (Page) mem.elementAt(oldestPage);
    Page nextpage = (Page) mem.elementAt(replacePageNum);
    controlPanel.removePhysicalPage(oldestPage);
    nextpage.physical = page.physical;
    controlPanel.addPhysicalPage(nextpage.physical, replacePageNum);
    page.inMemTime = 0;
    page.lastTouchTime = 0;
    page.R = 0;
    page.M = 0;
    page.physical = -1;
  }
}


memory/tracefile
READ 4 ... okay
READ 13 ... okay
WRITE cc32 ... okay
READ 4000 ... okay
READ 4000 ... okay
WRITE 6001 ... okay
WRITE 43bad ... okay

memory/user_guide.html

MOSS
Memory Management Simulator
User Guide

Purpose

This document is a user guide for the MOSS
Memory Management Simulator. It explains how to use the simulator and
describes the display and the various input files used by and output files
produced by the simulator. The MOSS software
is designed for use with
Andrew S. Tanenbaum,
Modern Operating Systems, 2nd Edition
(Prentice Hall, 2001).
The Memory Management Simulator was written by
Alex Reeder
(alexr@e-sa.org).
This user guide was written by
Ray Ontko
(rayo@ontko.com).

This user guide assumes that you have already installed and tested
the simulator. If you are looking for installation information,
please read the
Installation Guide for
Unix/Linux/Solaris/HP-UX Systems or the
Installation Guide for
Win95/98/Me/NT/2000 Systems.

Introduction

The memory management simulator illustrates page fault behavior
in a paged virtual memory system. The program reads the initial
state of the page table and a sequence of virtual memory
instructions and writes a trace log indicating the effect of each
instruction. It includes a graphical user interface so that
students can observe page replacement algorithms at work. Students
may be asked to implement a particular page replacement algorithm
which the instructor can test by comparing the output from the
student’s algorithm to that produced by a working implementation.

Running the Simulator

The program reads a command file, optionally reads
a configuration file, displays a GUI window which
allows you to execute the command file, and optionally
writes a trace file.

To run the program, enter the following command line.

$ java MemoryManagement commands memory.conf

The program will display a window allowing you to run the
simulator. You will notice a row of command buttons across
the top, two columns of “page” buttons at the left, and an
informational display at the right.

Typically you will
use the step button to execute a command from the
input file, examine information about any pages by clicking
on a page button, and when you’re done, quit the
simulation using the exit button.

The buttons:

Button
Description

run
runs the simulation to completion. Note that the simulation
pauses and updates the screen between each step.

step
runs a single step of the simulation and updates the display.

reset
initializes the simulator and starts from the beginning of
the command file.

exit
exits the simulation.

page n
display information about this virtual page in the display
area at the right.

The informational display:

Field
Description

status:
RUN, STEP, or STOP. This indicates whether the current
run or step is completed.

time:
number of “ns” since the start of the simulation.

instruction:
READ or WRITE. The operation last performed.

address:
the virtual memory address of the operation last performed.

page fault:
whether the last operation caused a page fault to occur.

virtual page:
the number of the virtual page being displayed in the
fields below. This is the last virtual page accessed by the simulator,
or the last page n button pressed.

physical page:
the physical page for this virtual page, if any. -1
indicates that no physical page is associated with this virtual page.

R:
whether this page has been read. (1=yes, 0=no)

M:
whether this page has been modified. (1=yes, 0=no)

inMemTime:
number of ns ago the physical page was allocated to this virtual
page.

lastTouchTime:
number of ns ago the physical page was last touched (read or written).

low:
low virtual memory address of the virtual page.

high:
high virtual memory address of the virtual page.
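The low and high fields follow directly from the page size. As a minimal sketch (the class and method names below are hypothetical and not part of the simulator), assuming pages are numbered from 0 and each covers pageSize consecutive addresses:

```java
// Sketch only: PageRange and its methods are illustrative names,
// not part of the simulator's source.
class PageRange {
    /** Virtual page containing addr when every page is pageSize bytes. */
    static int pageOf(long addr, long pageSize) {
        return (int) (addr / pageSize);
    }

    /** Lowest virtual address on the given page. */
    static long low(int page, long pageSize) {
        return page * pageSize;
    }

    /** Highest virtual address on the given page. */
    static long high(int page, long pageSize) {
        return (page + 1) * pageSize - 1;
    }

    public static void main(String[] args) {
        long pageSize = 16384;                   // example page size, 2^14 bytes
        long addr = Long.parseLong("CC32", 16);  // example address, 52274 in decimal
        int page = pageOf(addr, pageSize);
        // prints "page 3, low c000, high ffff"
        System.out.println("page " + page
                + ", low " + Long.toString(low(page, pageSize), 16)
                + ", high " + Long.toString(high(page, pageSize), 16));
    }
}
```

Integer division gives the same answer as a linear search over the page ranges, which is convenient when checking the display by hand.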

The Command File

The command file for the simulator specifies a sequence
of memory instructions to be performed. Each instruction
is either a memory READ or WRITE operation, and includes
a virtual memory address to be read or written. Depending on whether
the virtual page for the address is present in physical
memory, the operation will succeed, or, if not, a page fault
will occur.

Operations on Virtual Memory

There are two operations one can carry out on pages in memory:
READ and WRITE.

The format for each command is

operation address

or
operation random

where operation is READ or WRITE,
and address is the numeric virtual memory address, optionally
preceded by one of the radix keywords bin, oct,
or hex. If no radix is supplied, the number is assumed
to be decimal.
The keyword random will generate a random virtual
memory address
(for those who want to experiment quickly)
rather than having to type an address.

For example, the sequence

READ bin 01010101
WRITE bin 10101010
READ random
WRITE random

causes the virtual memory manager to:
read from virtual memory address 85

write to virtual memory address 170

read from some random virtual memory address

write to some random virtual memory address
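The radix handling described above can be sketched as follows. This is an illustration under stated assumptions rather than the simulator's own parser: the class name, the method name, and the use of -1 to stand for "random" are all hypothetical.

```java
import java.util.StringTokenizer;

// Sketch only: CommandLineSketch is an illustrative name, and returning -1
// for "random" is an assumption made to keep the example short.
class CommandLineSketch {
    /** Returns the address in a command such as "READ bin 01010101". */
    static long parseAddress(String line) {
        StringTokenizer st = new StringTokenizer(line);
        st.nextToken();                  // skip the READ or WRITE keyword
        String tok = st.nextToken();     // radix keyword, "random", or a decimal number
        switch (tok) {
            case "random": return -1;    // caller would pick a random address
            case "bin":    return Long.parseLong(st.nextToken(), 2);
            case "oct":    return Long.parseLong(st.nextToken(), 8);
            case "hex":    return Long.parseLong(st.nextToken(), 16);
            default:       return Long.parseLong(tok);   // no keyword: decimal
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAddress("READ bin 01010101"));   // prints 85
        System.out.println(parseAddress("WRITE bin 10101010"));  // prints 170
    }
}
```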

Sample Command File

The “commands” input file looks like this:

// Enter READ/WRITE commands into this file
// READ
// WRITE
READ bin 100
READ 19
WRITE hex CC32
READ bin 100000000000000
READ bin 100000000000000
WRITE bin 110000000000001
WRITE random

The Configuration File

The configuration file memory.conf is used to specify
the initial content of the virtual memory map
(which pages of virtual
memory are mapped to which pages in physical memory)
and provide other configuration information, such
as whether operations should be logged to a file.
Setting Up the Virtual Memory Map

The memset command is used to initialize each
entry in the virtual page map.
memset is followed by six integer values:

The virtual page # to initialize

The physical page # associated with this virtual page
(-1 if no page assigned)

If the page has been read from (R) (0=no, 1=yes)

If the page has been modified (M) (0=no, 1=yes)

The amount of time the page has been in memory (in ns)

The time since the page was last touched (in ns)

The first two parameters define the mapping between
the virtual page and a physical page, if any.
The last four parameters are values that might be used
by a page replacement algorithm.
For example,

memset 34 23 0 0 0 0

specifies that virtual page 34 maps to physical page 23,
and that the page has not been read or modified.
Note:

Each physical page should be mapped to exactly one virtual page.

The number of virtual pages is fixed at 64 (0..63).

The number of physical pages cannot exceed 64 (0..63).

If a virtual page is not specified by any memset command,
it is assumed that the page is not mapped.

Other Configuration File Options

There are a number of other options which can
be specified in the configuration file. These are
summarized in the table below.

Keyword Values Description

enable_logging true
false
Whether logging of the operations should be enabled. If logging
is enabled, then the program writes a one-line message for each
READ or WRITE operation. By default, no logging is enabled.
See also the log_file option.

log_file
trace-file-name
The name of the file to which log messages should be written.
If no filename is given, then log messages are written to stdout.
This option has no effect if enable_logging is false
or not specified.

pagesize n
power p
The size of the page in bytes as a power of two.
This can be given as a decimal number which is a
power of two (1, 2, 4, 8, etc.) or as a power of two using
the power keyword. The maximum page size is
67108864 or power 26. The default
page size is power 26.

addressradix n
The radix in which numerical values are displayed.
The default radix is 2 (binary). You may prefer radix
8 (octal), 10 (decimal), or 16 (hexadecimal).

Sample Configuration File

The “memory.conf” configuration file looks like this:

// memset virt page # physical page # R (read from) M (modified) inMemTime (ns) lastTouchTime (ns)
memset 0 0 0 0 0 0
memset 1 1 0 0 0 0
memset 2 2 0 0 0 0
memset 3 3 0 0 0 0
memset 4 4 0 0 0 0
memset 5 5 0 0 0 0
memset 6 6 0 0 0 0
memset 7 7 0 0 0 0
memset 8 8 0 0 0 0
memset 9 9 0 0 0 0
memset 10 10 0 0 0 0
memset 11 11 0 0 0 0
memset 12 12 0 0 0 0
memset 13 13 0 0 0 0
memset 14 14 0 0 0 0
memset 15 15 0 0 0 0
memset 16 16 0 0 0 0
memset 17 17 0 0 0 0
memset 18 18 0 0 0 0
memset 19 19 0 0 0 0
memset 20 20 0 0 0 0
memset 21 21 0 0 0 0
memset 22 22 0 0 0 0
memset 23 23 0 0 0 0
memset 24 24 0 0 0 0
memset 25 25 0 0 0 0
memset 26 26 0 0 0 0
memset 27 27 0 0 0 0
memset 28 28 0 0 0 0
memset 29 29 0 0 0 0
memset 30 30 0 0 0 0
memset 31 31 0 0 0 0
// enable_logging 'true' or 'false'
// When true specify a log_file or leave blank for stdout
enable_logging true
// log_file
// Where is the name of the file you want output
// to be print to.
log_file tracefile
// page size, defaults to 2^14 and cannot be greater than 2^26
// pagesize or
pagesize 16384
// addressradix sets the radix in which numerical values are displayed
// 2 is the default value
// addressradix
addressradix 16
// numpages sets the number of pages (physical and virtual)
// 64 is the default value
// numpages must be at least 2 and no more than 64
// numpages
numpages 64

The Output File

The output file contains a log of the operations
since the simulation started (or since the last reset).
It lists the command that was attempted and what happened
as a result. You can review this file after executing
the simulation.

The output file contains one line per operation
executed. The format of each line is:

command address ... status

where:
command is READ or WRITE,

address is a number corresponding to a virtual memory address, and

status is okay or page fault.

Sample Output

The output “tracefile” looks something like this:

READ 4 ... okay
READ 13 ... okay
WRITE 3acc32 ... okay
READ 10000000 ... okay
READ 10000000 ... okay
WRITE c0001000 ... page fault
WRITE 2aeea2ef ... okay

Suggested Exercises

Create a command file that maps any 8 pages of physical memory to
the first 8 pages of virtual memory, and then reads from one virtual
memory address on each
of the 64 virtual pages. Step through
the simulator one operation at a time and see if you can predict
which virtual memory addresses cause page faults. What page replacement
algorithm is being used?

Modify replacePage() in
PageFault.java to implement a round robin
page replacement algorithm
(i.e., first page fault
replaces page 0, next one replaces page 1, next one replaces page 2,
etc.).

Modify replacePage() in
PageFault.java to implement a least recently used
(LRU) page replacement algorithm.
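For the LRU exercise, the core step is choosing a victim. The sketch below is one possible approach, not the simulator's code: it uses a hypothetical stand-in for Page holding only the two fields it needs, and it assumes lastTouchTime counts the ns since the page was last touched, so the largest value marks the least recently used page.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: LruSketch and its nested Page are illustrative stand-ins.
class LruSketch {
    /** Minimal stand-in for a page table entry. */
    static class Page {
        int physical;        // -1 means the page is not mapped
        int lastTouchTime;   // assumed: ns since the page was last touched

        Page(int physical, int lastTouchTime) {
            this.physical = physical;
            this.lastTouchTime = lastTouchTime;
        }
    }

    /** Index of the mapped page idle the longest, or -1 if none is mapped. */
    static int chooseVictim(List<Page> mem) {
        int victim = -1;
        int longestIdle = -1;
        for (int i = 0; i < mem.size(); i++) {
            Page p = mem.get(i);
            if (p.physical != -1 && p.lastTouchTime > longestIdle) {
                longestIdle = p.lastTouchTime;
                victim = i;
            }
        }
        return victim;
    }

    public static void main(String[] args) {
        List<Page> mem = new ArrayList<>();
        mem.add(new Page(0, 30));
        mem.add(new Page(-1, 99));   // unmapped, so never a candidate
        mem.add(new Page(2, 50));
        mem.add(new Page(3, 10));
        System.out.println(chooseVictim(mem));   // prints 2
    }
}
```

A full replacePage() would then hand the victim's physical page to the faulting page and update the display, as the FIFO version does.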

To Do

The user guide should tell a little bit about how replacePage works, e.g.
what data structures it uses, what the arguments are,
how it operates, how it makes its choice known, etc.
Add a section of documentation on how to implement a new
page replacement algorithm. This should explain a little about
what changes are needed in the GUI, what to call the page replacement
class, what fields and methods it needs to provide, and what
other changes might be needed.

© Copyright 2001, Prentice-Hall, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program (see copying.txt);
if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Please send suggestions, corrections, and comments to
Ray Ontko
(rayo@ontko.com).

Last updated: July 28, 2001

memory/user_guide_1.gif

memory/Virtual2Physical.java

import java.util.Vector;

public class Virtual2Physical {
  // Returns the number of the virtual page containing memaddr,
  // or -1 if the address lies outside every page.
  public static int pageNum(long memaddr, int numpages, long block) {
    int i = 0;
    long high = 0;
    long low = 0;

    for (i = 0; i <= numpages; i++) {
      low = block * i;
      high = block * (i + 1);
      if (low <= memaddr && memaddr < high) {
        return i;
      }
    }
    return -1;
  }
}

Data structures assignment/week9b

CI583: Data Structures and Operating Systems
File systems: S5FS

S5FS

Typical of Thompson and Ritchie, the early UNIX file system was elegantly simple. S5FS was the version shipped with version 5 of UNIX. Files can be accessed efficiently whether in sequential or random order. Performance and robustness, however, are not sufficient for modern use. (Improving it at the time would have been premature optimisation: at a time when processors handled tens or hundreds of instructions per second, rather than millions, S5FS was fast enough.)

In an S5FS disk, the first block after the boot block is occupied by the superblock, which describes the layout of the rest of the disk and contains pointers to the heads of various lists used to manage free space.

[Figure: disk layout, blocks 0 to n: superblock, i-list, data region]

Next come one or more blocks containing an array of inodes, each of which describes a file or is free. Next is the data region.

Each occupied inode describes a file. Directories are files containing pairs of pathnames (e.g. /home/jim/.emacs) and inode numbers, which are indexes into the i-list.

[Figure: inode fields: device, inode number, mode, link count, owner and group, size, disk map]

The most interesting part of the inode is the disk map, which maps logical block numbers to physical blocks. Each block is 1024 bytes long. The first ten entries map directly to the first ten blocks (10kB) in the file.

[Figure: disk map entries 0 to 12, pointing to data blocks, indirect blocks, double indirect blocks and a triple indirect block. Figure adapted from Doeppner, Operating Systems in Depth.]
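To make the disk map concrete, here is a minimal sketch that works out how many indirect blocks must be followed to reach a given logical block. The class and method names are hypothetical. It assumes 4-byte block pointers, so that a 1024-byte indirect block holds 256 of them, and that entries 10, 11 and 12 lead to the single, double and triple indirect blocks shown in the figure.

```java
// Sketch only: DiskMapSketch is an illustrative name, and the pointer size
// is an assumption (4 bytes, giving 256 pointers per 1024-byte block).
class DiskMapSketch {
    static final int DIRECT = 10;    // direct entries in the disk map
    static final long PTRS = 256;    // assumed pointers per indirect block

    /** Levels of indirection needed to reach logical block n (0 means direct). */
    static int indirectionLevel(long n) {
        if (n < 0) throw new IllegalArgumentException("negative block number");
        if (n < DIRECT) return 0;
        n -= DIRECT;
        if (n < PTRS) return 1;
        n -= PTRS;
        if (n < PTRS * PTRS) return 2;
        n -= PTRS * PTRS;
        if (n < PTRS * PTRS * PTRS) return 3;
        throw new IllegalArgumentException("block number beyond the triple indirect range");
    }

    public static void main(String[] args) {
        long blockSize = 1024;
        long offset = 300_000;                     // example byte offset in a file
        long logicalBlock = offset / blockSize;    // block 292
        System.out.println(indirectionLevel(logicalBlock));   // prints 2
    }
}
```

Each extra level costs one more block read before the data block itself, which is why small files, served entirely by the direct entries, are the cheapest to access.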
6 / 12

S5FS

The next entry in the disk map maps to an indirect block containing up to 256 4-byte pointers to real blocks, or 256kB of data.

[Figure: disk map entries 0 to 12, pointing to data blocks, indirect blocks, double indirect blocks and a triple indirect block.]

7 / 12

S5FS

The next entry maps to a double indirect block, containing up to 256 pointers to indirect blocks, or 64MB of data.

8 / 12

S5FS

If the file is bigger than this, the next entry maps to a triple indirect block, containing up to 256 pointers to double indirect blocks, or 16GB of data. (The real limit is less than this because the file size needs to be stored in 32 bits.)

9 / 12

S5FS

Sequential and random access are fairly efficient, but might require several lookups per block.

Imagine allocating 2GB of storage for an empty file: all pointers in the disk map are null except the last one, which points to the triple indirect block. Only four blocks are required.

10 / 12

S5FS

When we want to allocate a new free block, we don't want to start looking through the disk. The same goes for freeing space: it should require minimal I/O.

11 / 12

S5FS

A list of 100 free blocks is maintained in the superblock. When the list is empty, the disk is searched for another set of free blocks. When a block is liberated, it is added to the list of free blocks.

The superblock maintains a list of free inodes in a similar way.

12 / 12

Data structures assignment/week4e

CI583: Data Structures and Operating Systems

Recursion, and some recursive problems

1 / 16

MergeSort

MergeSort is a recursive algorithm that is much more efficient than the simple sorting methods we studied earlier: it is loglinear, or O(n log n), rather than O(n^2).
To sort 10,000 items, this means that MergeSort is proportional to 40,000 steps (taking logarithms to base 10), while SelectionSort is proportional to 100,000,000 steps. If the MergeSort took 40 seconds, the SelectionSort would take in the region of 28 hours.

The downside is that MergeSort uses extra memory, making it a good choice only if we have enough space.

2 / 16

MergeSort

At the heart of MergeSort is the process of merging two sorted arrays, A and B, into a third array, C.

We start with pointers to the front of A and B and compare the first elements. We move the smaller into C and increment the pointer for the array it came from.

So, if the first element is moved from B[0] into C then the next step will be to compare A[0] and B[1], and so on. A and B need not be the same size. When we get to the end of either A or B we can move elements from the remaining array without any further comparisons.

3 / 16

MergeSort

That explains how to merge two sorted arrays, but how did they become sorted in the first place?

The idea behind MergeSort is to divide an array into two halves, sort the halves then merge them. We do this recursively, dividing the subarrays until we reach the base case: an array with one element.

A one element array is already sorted, so when we have two of those, we just need to merge them.

4 / 16

MergeSort

[Figure: MergeSort applied to 64 21 33 70 12 85 44 3.]

Split: 64 21 33 70 | 12 85 44 3
Split: 64 21 | 33 70 | 12 85 | 44 3
Merge: 21 64 | 33 70 | 12 85 | 3 44
Merge: 21 33 64 70 | 3 12 44 85
Merge: 3 12 21 33 44 64 70 85

5 / 16

MergeSort

Although we are hiding the complexity in the Merge algorithm, the recursive MergeSort algorithm is extremely elegant. The storage for all of these smaller arrays comes from allocating a temporary array with the same size as the input.
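The merge step and the recursive halving described above can be sketched in Java as follows. This is a minimal illustration rather than the module's reference implementation: the class name `MergeSortSketch` and the helper names are assumptions made for this example, and the single temporary array mirrors the remark about allocating storage the same size as the input.

```java
import java.util.Arrays;

final class MergeSortSketch {

    /** Sorts the whole array, using one temporary array of the same size. */
    static void mergeSort(int[] array) {
        int[] temp = new int[array.length];
        mergeSort(array, temp, 0, array.length - 1);
    }

    private static void mergeSort(int[] array, int[] temp, int first, int last) {
        if (first < last) {                       // base case: one element is already sorted
            int mid = first + (last - first) / 2;
            mergeSort(array, temp, first, mid);
            mergeSort(array, temp, mid + 1, last);
            merge(array, temp, first, mid, last);
        }
    }

    /** Merges the sorted runs array[first..mid] and array[mid+1..last]. */
    private static void merge(int[] array, int[] temp, int first, int mid, int last) {
        int a = first;      // pointer into the left run
        int b = mid + 1;    // pointer into the right run
        int c = first;      // pointer into the temporary array
        while (a <= mid && b <= last) {
            temp[c++] = (array[a] <= array[b]) ? array[a++] : array[b++];
        }
        // One run is exhausted: copy the rest without further comparisons.
        while (a <= mid) temp[c++] = array[a++];
        while (b <= last) temp[c++] = array[b++];
        System.arraycopy(temp, first, array, first, last - first + 1);
    }

    public static void main(String[] args) {
        int[] data = {64, 21, 33, 70, 12, 85, 44, 3};
        mergeSort(data);
        System.out.println(Arrays.toString(data)); // [3, 12, 21, 33, 44, 64, 70, 85]
    }
}
```

Taking from the left run when the two elements are equal keeps the sort stable, which is a design choice of this sketch rather than something the slides require.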
procedure MergeSort(array, first, last)
    if first < last then
        mid ← (first + last)/2
        MergeSort(array, first, mid)
        MergeSort(array, mid + 1, last)
        Merge(array, first, mid, mid + 1, last)
    end if
end procedure

6 / 16

Complexity of MergeSort

As an informal consideration of the growth rate of MergeSort, consider the example from a few slides ago: 24 swaps were needed to sort 8 items. lg 8 is 3, and 8 × lg 8 is 24.

The swapping required, which is a similar process to a binary search, halving the problem each time, accounts for the logarithmic component in O(n log n).

As for the number of comparisons, in the worst case, merging n items could take n − 1 comparisons, and this is where the n in O(n log n) comes from.

7 / 16

QuickSort

The last sorting algorithm we will look at is QuickSort. Like MergeSort, it is highly efficient, but in this case only the average and best performance are O(n log n). The worst case performance is much worse, O(n^2).

Unlike MergeSort, however, QuickSort does not require extra memory.

8 / 16

QuickSort

Because of the combination of low memory usage and very good average performance, QuickSort is one of the most popular sorting algorithms.

You may find that sorting a collection with the standard library method in your favourite language uses it (Java does for arrays of primitive values, for example).

9 / 16

QuickSort

The idea is that we pick a "pivot element", p, then move every element which is less than p to its left, and every element greater than p to its right. The two halves are not sorted, just less than or greater than p. At this point, p is in its final position. Then we call QuickSort recursively on the left and right.

The actual sorting is done recursively by the method that moves elements before or after the pivot, so the real work is done on the way down. This is the opposite of MergeSort, where the sorting is done on the way up, merging smaller lists into bigger ones.
10 / 16

QuickSort

procedure QuickSort(array, first, last)
    if first < last then
        p ← PivotList(array, first, last)
        QuickSort(array, first, p − 1)
        QuickSort(array, p + 1, last)
    end if
end procedure

11 / 16

QuickSort

There are several ways that we might choose the value of p. We would like a method that avoids picking a very large or small value, relative to the rest of the list, but obviously we are unable to examine every element.

An optimisation that has been found to be quite effective is to examine the first, last and middle elements and pick the median: this is the median-of-three approach. At the same time as choosing the pivot, we sort the three elements, so that the pivot is in the mid point.

12 / 16

QuickSort

In the example coming up we use the simplest way of picking the pivot: set p to the first element then move any element less than p before it and leave everything else where it is.

13 / 16

QuickSort

[Figure: one run of PivotList on 64 21 33 70 12 85 44 3, with the pivot p = 64 marked at each step.]

14 / 16

QuickSort

When PivotList is called on a list with n elements it needs to do n − 1 comparisons.

The worst case is one in which every element is larger (or smaller) than p. In this case we end up with O(n^2) performance, no better than BubbleSort. One situation that will cause this is if the list is already sorted or inversely sorted.

15 / 16

QuickSort

In the best case, p will end up in the middle of the partition being sorted, and this halving-every-time gives us loglinear performance.

Analysing the average case performance results in the same order: O(n log n). See, for example, the Cormen book for full details.

16 / 16

Next week

Hashtables!
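As a recap of the QuickSort material in this lecture, here is a minimal Java sketch of the QuickSort procedure together with the simplest pivot scheme (the first element is the pivot). The class name `QuickSortSketch` and the exact partition loop are assumptions for illustration; the slides name the partition step PivotList but do not give its body.

```java
import java.util.Arrays;

final class QuickSortSketch {

    static void quickSort(int[] array) {
        quickSort(array, 0, array.length - 1);
    }

    static void quickSort(int[] array, int first, int last) {
        if (first < last) {
            int p = pivotList(array, first, last); // p is now in its final position
            quickSort(array, first, p - 1);
            quickSort(array, p + 1, last);
        }
    }

    /**
     * Uses array[first] as the pivot, moves every smaller element before it,
     * and returns the pivot's final index. Performs (last - first) comparisons.
     */
    static int pivotList(int[] array, int first, int last) {
        int pivotValue = array[first];
        int pivotPoint = first;
        for (int i = first + 1; i <= last; i++) {
            if (array[i] < pivotValue) {
                pivotPoint++;
                swap(array, pivotPoint, i);
            }
        }
        swap(array, first, pivotPoint); // drop the pivot between the two parts
        return pivotPoint;
    }

    private static void swap(int[] array, int i, int j) {
        int tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {64, 21, 33, 70, 12, 85, 44, 3};
        quickSort(data);
        System.out.println(Arrays.toString(data)); // [3, 12, 21, 33, 44, 64, 70, 85]
    }
}
```

With this pivot choice an already sorted input triggers the quadratic worst case; a median-of-three pivot would be one way to make that less likely.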
Data structures assignment/week5a

CI583: Data Structures and Operating Systems

Introduction to Hashtables

1 / 20

Hash tables

The hash table is an incredibly efficient data structure. Under the right conditions, it provides better selection and update performance than a BST or any type of self-balancing tree. One of their few downsides is that they don't provide an easy way to visit the data in order.

They were invented in the early 50s by a programmer called H.P. Luhn working for IBM, and independently by several others.

Applications include general purpose key->value maps in the
standard library of every programming language, database indices,
OS caching, and symbol tables used by compilers.

2 / 20

Hash tables

Values are stored in an array, to which keys provide an index. Cells
in the array are often called buckets.

In order to translate a key (which might be a string, numeric or
other value) into an array index, a hash function is used.

The hash function is a constant time function, as is array access, so the hash table
provides O(1) performance for insertion, deletion and search.

However, this is only true if the hash table is not more than about two-thirds full, for
reasons that will be explained.

Image © http://en.wikipedia.org/

3 / 20

background

Imagine that we needed to produce an employee database for a
small company. Every employee has a unique payroll number and
various other bits of information, such as their contact details,
national insurance number, etc.

class Employee {
    int empNumber;
    String surname;
    String forename;
    // ...
}

The employee numbers range from 1 to 1000 and there is no need
for deletions: if an employee leaves, we want to keep their records
on file.

4 / 20

Motivation

In this case it’s perfectly reasonable to store Employee objects in
an array and use employee numbers as keys.

Whenever we run out of space, we can create a new, larger array
and copy the old records across. In this way we get the constant
time performance of an array.

However, very few keys are as well-behaved as this one.

These keys run sequentially and there will be no deletions, meaning
that we don’t need to worry about fragmentation. The maximum
employee number is a reasonable size for an array.

5 / 20

Motivation

What if we want to use a string as the key? This might be to store
employee records by National Insurance number, or to store a
dictionary of words and definitions.

Say we want to store the contents of a 50,000 word dictionary in an
array. Each definition occupies its own cell and we can look it up
without searching through the whole array if we know the index.

What we need is a way of converting a string into an appropriate
index number.

6 / 20

Hash functions

We know that there are various encodings used to map characters
to numbers, such as ASCII in which a=97, b=98 and so on.

However, ASCII runs from 0 to 127 (0 to 255 in its 8-bit extensions) and
accommodates lots of non-printable characters and symbols. We can make a simpler
system where a=1, b=2, up to z=26. We can make 0 stand for the
space character, so we have 27 characters.

How can we use this to encode a sequence of characters?

7 / 20

Hash functions

A simplistic approach would be to sum the code numbers for each
character. To encode the word "cats" we have

c=3,

a=1,

t=20, and

s=19.

Thus cats = 3 + 1 + 20 + 19 = 43. If we restrict ourselves to 10-letter words the
largest code is that for "zzzzzzzzzz": 26 × 10 = 260.

8 / 20

Hash functions

So, we can only store 260 unique values, but our dictionary
contains 50,000 word/definition pairs.

We could store a list of the definitions whose keys have the same
encoding in each cell: "tin", "give", "tend" and hundreds of other
words all map to 43 in our encoding.

On average, each entry will contain a list of 192 definitions
(50,000/260) and we have lost the constant time performance.
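The additive encoding above is easy to try out. The following Java sketch is only an illustration (the class name `AdditiveHashSketch` and its method names are assumptions); it uses the a=1 to z=26, space=0 scheme from the slides and shows several words colliding on the same code.

```java
final class AdditiveHashSketch {

    /** Maps 'a'..'z' to 1..26 and the space character to 0. */
    static int code(char ch) {
        return ch == ' ' ? 0 : ch - 'a' + 1;
    }

    /** Sums the character codes of a lower-case word. */
    static int sumHash(String word) {
        int sum = 0;
        for (int i = 0; i < word.length(); i++) {
            sum += code(word.charAt(i));
        }
        return sum;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"cats", "tin", "give", "tend"}) {
            System.out.println(w + " -> " + sumHash(w)); // each word prints 43
        }
        System.out.println("zzzzzzzzzz -> " + sumHash("zzzzzzzzzz")); // 260
    }
}
```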

9 / 20

Hash functions

Our array was too small before, so we need to make it bigger…

At the other end of the scale, we could store each definition in its
own cell. So we need to map strings to unique indices.

Recall that we can think of a decimal number in terms of powers of
10. E.g.,

$862{,}486 = 8 \times 10^5 + 6 \times 10^4 + 2 \times 10^3 + 4 \times 10^2 + 8 \times 10^1 + 6 \times 10^0$.

10 / 20

Hash functions

Similarly, we can break down a word into individual character codes
and multiply them by powers of 27 (the radix of our encoding):

$\mathrm{key}(\text{"cats"}) = 3 \times 27^3 + 1 \times 27^2 + 20 \times 27^1 + 19 \times 27^0$.

This does indeed give us a unique key for every string.
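To make the powers-of-27 encoding concrete, here is a small Java sketch. The class and method names are assumptions for illustration, and it uses `long` arithmetic, which is only safe for short words (10-letter words fit comfortably).

```java
final class Base27KeySketch {

    /** Maps 'a'..'z' to 1..26 and the space character to 0. */
    static int code(char ch) {
        return ch == ' ' ? 0 : ch - 'a' + 1;
    }

    /** Treats the word as a base-27 number, rightmost character least significant. */
    static long key(String word) {
        long value = 0;
        long power = 1; // 1, 27, 27*27, ...
        for (int i = word.length() - 1; i >= 0; i--) {
            value += power * code(word.charAt(i));
            power *= 27;
        }
        return value;
    }

    public static void main(String[] args) {
        // 3*19683 + 1*729 + 20*27 + 19 = 60337
        System.out.println(key("cats"));
    }
}
```

Distinct words now get distinct keys: "tin" and "give", which collided under the additive scheme, map to different numbers here.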

11 / 20

Hash functions

Unfortunately, the range of numbers has become too large. The
biggest number, the key for "zzzzzzzzzz", is
$26 \times 27^9 + 26 \times 27^8 + \dots + 26 \times 27^0$, which is well into the trillions. There are
several reasons why we can't handle an array that big!

The other problem is that most of the array will be empty: the
scheme assigns an index to any combination of 10 characters, most
of which aren't English words.

12 / 20

Hash functions

We need to compress a huge range of numbers into one that will fit
in a reasonably sized array.

We have 50,000 words to store, so you might presume we need an
array with 50,000 elements, although we will actually want about
twice that amount, as will become clear.

We can do this using the modulus operator, %:

index = hugeNumber(key) % arraySize.

This is an example of a hash function. A perfect hash maps distinct
keys to unique locations, but this is normally not possible.

13 / 20

Hash functions

In the case of our dictionary, max1 is more than 7 trillion and max2
is 100,000. Numbers in the big range will overflow numeric types,
but we will deal with that later.

Several elements in the big range may map to a single element in the smaller range
and some elements in the small range have nothing that maps to them.

On average, we have one entry for every two cells.

[Figure: keys in the range 0 to max1 mapped onto array indices in the range 0 to max2.]

14 / 20

Handling collisions

Using a scheme like this, it is impossible to avoid collisions: this is
the situation in which two words map to the same array index and
this is the reason we make the hash table about twice as large as
the data to be stored.

There are several ways we could respond when a collision occurs.
The first, called open addressing, is to search in some systematic
way for an empty location and store the new value there.

The second approach would be to store a collection of entries at
each index: this is called separate chaining, and makes it clear why
each index in the array is sometimes called a bucket.

15 / 20

Open addressing

The simplest way to handle open addressing is called linear probing:
If the location that we want to store a value in is occupied, we
search sequentially for the next unoccupied location.

Say "cats" and "banana" have the same hash, 54321. If we have
already stored the definition of "banana" and come to insert "cats",
we look at index 54322, then 54323, and so on, until we find
somewhere to store the new value.

16 / 20

Open addressing

If we want to search for "cats" later on, we will first look at the
index we get by hashing the key: 54321.

If it were unoccupied, then the search simply failed. In this case,
the cell is occupied, but with the wrong data.

This should make it clear that we need to store the key as well as
any other data.

E.g. if our hash table just maps words to their definitions, we would
have looked up "cats" and retrieved the definition "A long curved
fruit that grows in clusters and has soft pulpy flesh and yellow skin
when ripe."

17 / 20

Open addressing

So we are actually storing word/definition pairs and we can see that
the value stored at 54321 isn't the right one.

We search forward sequentially: this is the linear probe. As soon
as we come to an empty cell, we know the search has failed.

18 / 20

Open addressing

Searching like this makes deletions problematic: say "kumquat" also
maps to 54321 and was inserted after "banana" but before "cats",
so that it occupies index 54322.

If we later delete "kumquat" we need to store a special value there,
some sort of dummy Nil item.

Then the probe for "cats" does not stop early at index 54322, as it
would if it found a truly empty cell there.

The search for the location for a new entry can use one that
contains this special value.

19 / 20

Open addressing

If there are a lot of deletions the table will contain lots of dummy
Nil items, making searching less efficient. The number of items
searched to retrieve or insert an item is called the probe length.

Many hash table implementations don't allow deletions for this
reason. Similarly, duplicates are normally disallowed and inserting a
value with an existing key just overwrites the existing value.

So, our insert algorithm will now probe for the first empty location,
starting with the result of the hash function for the key, but will
accept a location that already contains the key.
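The open addressing rules described over the last few slides (linear probing, storing the key with the value, a dummy Nil item for deletions, and overwriting on an existing key) can be put together in one small Java sketch. Everything here is illustrative: the class name `LinearProbeTableSketch`, the pluggable hash function, and the choice to let inserts reuse the first Nil cell are assumptions, not the module's own implementation.

```java
import java.util.function.ToIntFunction;

final class LinearProbeTableSketch {

    private static final class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    // Dummy "Nil" item left behind by a deletion so that probes keep going past it.
    private static final Entry NIL = new Entry(null, null);

    private final Entry[] cells;
    private final ToIntFunction<String> hashFunction;

    LinearProbeTableSketch(int capacity, ToIntFunction<String> hashFunction) {
        this.cells = new Entry[capacity];
        this.hashFunction = hashFunction;
    }

    private int home(String key) {
        return Math.floorMod(hashFunction.applyAsInt(key), cells.length);
    }

    /** Inserts or overwrites; returns false only if the table is full. */
    boolean put(String key, String value) {
        int start = home(key);
        int firstFree = -1;
        for (int n = 0; n < cells.length; n++) {
            int i = (start + n) % cells.length;
            Entry e = cells[i];
            if (e == null) {                       // truly empty: the key is not in the table
                if (firstFree < 0) firstFree = i;
                break;
            } else if (e == NIL) {                 // reusable, but keep looking for the key
                if (firstFree < 0) firstFree = i;
            } else if (e.key.equals(key)) {        // existing key: overwrite
                e.value = value;
                return true;
            }
        }
        if (firstFree < 0) return false;
        cells[firstFree] = new Entry(key, value);
        return true;
    }

    /** Returns the stored value, or null if the key is absent. */
    String get(String key) {
        int i = find(key);
        return i < 0 ? null : cells[i].value;
    }

    /** Replaces the entry with the Nil marker; returns false if the key is absent. */
    boolean remove(String key) {
        int i = find(key);
        if (i < 0) return false;
        cells[i] = NIL;
        return true;
    }

    private int find(String key) {
        int start = home(key);
        for (int n = 0; n < cells.length; n++) {
            int i = (start + n) % cells.length;
            Entry e = cells[i];
            if (e == null) return -1;              // empty cell: the search has failed
            if (e != NIL && e.key.equals(key)) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        // A constant hash forces every key to collide, as in the banana/kumquat/cats example.
        LinearProbeTableSketch table = new LinearProbeTableSketch(11, k -> 54321);
        table.put("banana", "a long curved fruit");
        table.put("kumquat", "a small citrus fruit");
        table.put("cats", "small domesticated felines");
        table.remove("kumquat");
        System.out.println(table.get("cats")); // still found: the probe skips the Nil item
    }
}
```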

20 / 20


Data structures assignment/week5c

CI583: Data Structures and Operating Systems

Implementing hashtables: prime numbers and hash functions

1 / 13

A note on prime numbers

Prime numbers, those which no number divides evenly except 1 and
the number itself, are an essential part of many algorithms.

We often need to generate a new prime number, and some
algorithms call for very large ones. Unfortunately, most methods for
testing primality are O(2^n).

A naive way to test primality:

boolean isPrime(int n) {
    // naive trial division: try every candidate divisor below n
    for (int i = 2; i < n; i++) {
        if (n % i == 0) return false;
    }
    return n > 1;
}

6 / 13

Implementing a hash function

Note the presence of the constants 27 and 96. Why?

int hash1(String key) {
    int hashVal = 0;
    int pow27 = 1; // 1, 27, 27*27, etc
    for (int i = key.length() - 1; i >= 0; i--) {
        int c = key.charAt(i) - 96;
        hashVal += pow27 * c;
        pow27 *= 27;
    }
    return hashVal % arraySize;
}

7 / 13

Implementing a hash function

We can improve on this. The first insight is to use Horner's
Method: see (p41, Cormen).

This formula allows us to transform a polynomial in a single variable
into a computationally efficient form:

$vn^4 + wn^3 + xn^2 + yn^1 + zn^0 = (((vn + w)n + x)n + y)n + z$.
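As an illustration of Horner's Method applied to the base-27 string key, here is a hedged Java sketch. It is not the slides' own second hash method: the name `hornerHash`, the explicit `arraySize` parameter, and the decision to reduce modulo the array size after every step (which keeps intermediate values small) are assumptions made for this example.

```java
final class HornerHashSketch {

    /**
     * Evaluates the base-27 polynomial left to right using Horner's rule,
     * reducing modulo arraySize at each step. Assumes lower-case a..z keys
     * and an arraySize small enough that (arraySize * 27 + 26) fits in an int.
     */
    static int hornerHash(String key, int arraySize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            int c = key.charAt(i) - 96;              // 'a' is 97, so a=1 ... z=26
            hashVal = (hashVal * 27 + c) % arraySize;
        }
        return hashVal;
    }

    public static void main(String[] args) {
        // ((3*27 + 1)*27 + 20)*27 + 19 = 60337, and 60337 % 1000 = 337
        System.out.println(hornerHash("cats", 1000));
    }
}
```

Because (a × 27 + c) mod m gives the same result whether or not a was already reduced mod m, the final value equals the full base-27 key modulo the array size, without ever forming the huge key itself.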

8 / 13

Implementing a hash function

Applying this insight we rewrite the method, this time starting with
the leftmost character:

int hash2(String key) {
    int hashVal = key.charAt(0) - 96;
    for (int i = 0; i …

… e′, then their
places are swapped. We continue in this way until we reach the
end of the collection, which is the end of the first pass.

5 / 20

Bubble sort

After the first pass, the last element in the collection is the largest
one, and that position is sorted.

On the next pass, we only need to consider n−1 values, where n is
the size of the input, and on the one after that, n − 2, and so on.
If, on any given pass, there are no swaps then the data is sorted
and we stop.

6 / 20

Bubble sort

One pass of bubble sort:

23 7 5 31 9 18 4 32 6

7 / 20


Bubble sort

(This is more low-level than pseudo-code should normally be, but I
think it’s the clearest way to present the algorithm.)

procedure BubbleSort(arr, n)    // n is the length of arr
    for i from n − 1 down to 1 do    // loop backwards
        swapped ← false    // did we make any swaps this pass?
        for j from 0 up to i − 1 do    // loop forwards
            if arr[j] > arr[j + 1] then
                swap(arr[j], arr[j + 1])
                swapped ← true
            end if
        end for
        if swapped = false then
            break
        end if
    end for
end procedure
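The pseudocode translates almost line for line into Java. The following is a minimal sketch (the class name `BubbleSortSketch` is an assumption); note that the inner loop stops one short of i so that arr[j + 1] never runs past the unsorted region.

```java
import java.util.Arrays;

final class BubbleSortSketch {

    static void bubbleSort(int[] arr) {
        for (int i = arr.length - 1; i > 0; i--) {   // arr[i+1..] is already sorted
            boolean swapped = false;
            for (int j = 0; j < i; j++) {
                if (arr[j] > arr[j + 1]) {
                    int tmp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                break;                               // no swaps this pass: already sorted
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // [4, 5, 6, 7, 9, 18, 23, 31, 32]
    }
}
```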

8 / 20

Bubble sort
Complexity

On the first pass, bubble sort carries out n − 1 comparisons. In the
best case, there are no swaps and the algorithm terminates. We
will concentrate on the worst case.

The worst case is when the input was in reverse order: at the
beginning of the first pass, the largest element is in first place and
is swapped all the way to the end. At the beginning of the second
pass, the second largest element is in first place, etc.

9 / 20

Bubble sort
Complexity

So, the first pass does n− 1 comparisons, the second n− 2, and so
on.

Let W(n) be the worst case for n elements. Then

$W(n) = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = \frac{n^2 - n}{2} \approx \frac{1}{2}n^2 = O(n^2)$

10 / 20

Bubble sort
Complexity

As a rule of thumb, note that any bubble sort implementation
will have inner and outer loops over the same collection, or some
equivalent structure.

When we see this pattern we can take it to mean O(n^2), as we are
carrying out an O(n) task n times. The simplest sorting algorithms
are all in this order.

11 / 20

Selection sort

The idea behind selection sort is that we look through the whole
collection for the smallest element, then put that in first place. On
the next pass, we do the same thing but start with the second
element, and so on.

Selection sort requires the same number of comparisons as bubble
sort, but the number of swaps required is O(n). In bubble sort the
same element may be moved many times, but in selection sort
each element is likely to be moved much less often.

12 / 20

Selection sort

One pass of selection sort:

23 7 5 31 9 18 4 32 6

min

13 / 20


Selection sort

procedure SelectionSort(arr, n)    // n is the length of arr
    for out from 0 up to n − 2 do    // stop at one before the last element
        min ← out
        for in from out + 1 up to n − 1 do
            if arr[min] > arr[in] then
                min ← in
            end if
        end for
        swap(arr[out], arr[min])
    end for
end procedure
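A direct Java rendering of the selection sort pseudocode might look like this. It is a sketch for illustration (the class name `SelectionSortSketch` is an assumption); the method returns the number of swaps so that the "at most one swap per pass" property can be observed.

```java
import java.util.Arrays;

final class SelectionSortSketch {

    /** Sorts arr in place and returns how many swaps were performed. */
    static int selectionSort(int[] arr) {
        int swaps = 0;
        for (int out = 0; out < arr.length - 1; out++) {
            int min = out;
            for (int in = out + 1; in < arr.length; in++) {
                if (arr[min] > arr[in]) {
                    min = in;                 // remember the smallest element seen so far
                }
            }
            if (min != out) {                 // one swap per pass at most
                int tmp = arr[out];
                arr[out] = arr[min];
                arr[min] = tmp;
                swaps++;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        int swaps = selectionSort(data);
        System.out.println(Arrays.toString(data) + " using " + swaps + " swaps");
    }
}
```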

14 / 20

Selection sort
Complexity

In the worst case, we can see that selection sort is O(n^2), by the
same reasoning as for bubble sort.

However, it was easy to optimise the best case for bubble sort by
detecting that the input was already sorted and ending after the
first pass. Since we don’t necessarily compare every element to
each other in selection sort, this isn’t so easy.

Still, making fewer swaps gives selection sort better average
performance.

15 / 20

Insertion sort

Most of the time, insertion sort has the best performance of the
simple sorts we're looking at today. It is still O(n^2) but about
twice as fast as bubble sort.

The idea is to start by sorting the first two elements then, as we
move along the collection, to insert each element in the right place
in the sorted part of the collection.

16 / 20

Insertion sort

So, part of the input is always sorted and we keep inserting items
into that part. The sorted part grows and the unsorted part shrinks
until there is nothing left to do.

We need to keep track of which part of the collection is sorted,
and we need to store temporary values as we make room for an
element to be moved.

17 / 20

Insertion sort

Part of a run of insertion sort:

23 7 5 31 9 18 4 32 6

unsorted

temp

18 / 20


Insertion sort

You will probably have to think about this and compare it to the
applet to convince yourself that it’s true.

procedure InsertionSort(arr, n)    // n is the length of arr
    for out from 1 up to n − 1 do    // out is the dividing line
        tmp ← arr[out]
        in ← out    // start shifts at out
        while in > 0 and arr[in − 1] > tmp do
            arr[in] ← arr[in − 1]
            in ← in − 1
        end while
        arr[in] ← tmp
    end for
end procedure
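For comparison with the pseudocode, here is the same algorithm as a short Java sketch (the class name `InsertionSortSketch` is an assumption). Elements to the left of `out` are always sorted; each pass shifts larger elements one place right to open a gap for `tmp`.

```java
import java.util.Arrays;

final class InsertionSortSketch {

    static void insertionSort(int[] arr) {
        for (int out = 1; out < arr.length; out++) {  // out is the dividing line
            int tmp = arr[out];                       // element to insert
            int in = out;                             // start shifts at out
            while (in > 0 && arr[in - 1] > tmp) {
                arr[in] = arr[in - 1];                // shift larger element right
                in--;
            }
            arr[in] = tmp;                            // drop tmp into the gap
        }
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // [4, 5, 6, 7, 9, 18, 23, 31, 32]
    }
}
```

On already sorted input the while loop never runs, which is why insertion sort does so well on sorted or nearly sorted data.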

19 / 20

Insertion sort
complexity

The previous two sorts reduce the size of the problem at each pass.
In this case we increase it. On the first pass, we make one
comparison, on the second, a maximum of two, and so on.

$1 + 2 + \dots + (n - 1) = \frac{n(n-1)}{2}$

So the worst case is the same: O(n^2). However, insertion sort
performs much better when the data is sorted or "almost" sorted.

20 / 20


Data structures assignment/week2b

CI583: Data Structures and Operating Systems

A sorting algorithm which is impressively efficient…

1 / 11

Radix sort

Our next sorting algorithm works completely differently to any of
the ones we’ve seen so far, and does it without actually comparing
values to each other. What’s more, it has O(n) time complexity!

This is radix sort. For each element in the input, radix sort looks
at one digit at a time starting with the least significant. Elements
with the same value for that digit are “thrown” into the same
bucket.

2 / 11

Radix sort

Consider sorting the following data:

[ 310, 213, 023, 130, 013, 301, 222, 032, 201, 111, 323, 002, 330,
102, 231, 120 ].

Note that some elements are padded so that all elements have the
same number of digits.

3 / 11

Radix sort

We start by collecting elements with the same last (least significant) digit.

Bucket Contents
0 310 130 330 120
1 301 201 111 231
2 222 032 002 102
3 213 023 013 323

Emptying the buckets gives us a new list:

[ 310, 130, 330, 120, 301, 201, 111, 231, 222, 032, 002, 102, 213,
023, 013, 323 ].

4 / 11


Radix sort

Using the new list, we collect elements with the same second (middle) digit.

Bucket Contents
0 301 201 002 102
1 310 111 213 013
2 120 222 023 323
3 130 330 231 032

Emptying the buckets again:

[ 301, 201, 002, 102, 310, 111, 213, 013, 120, 222, 023, 323, 130,
330, 231, 032 ].

5 / 11


Radix sort

Finally, we collect elements with the same first (most significant) digit.

Bucket Contents
0 002 013 023 032
1 102 111 120 130
2 201 213 222 231
3 301 310 323 330

This time, emptying the buckets will give us a sorted list.

6 / 11

Radix sort

Radix sort has the air of a card trick about it, but it actually
corresponds to how people sort things (such as socks:
http://tx0.org/5c3) in real life.

Sticking to computing, we can use radix sort on data with other
kinds of keys too, such as strings (using 26 buckets or 52 for a
case-sensitive sort).

7 / 11


Radix sort

Generally, we need as many buckets as the number base or radix of
the input. We need to inspect each element k times, where k is
the number of digits in the biggest element.

k will be relatively small compared to n (e.g. when k = 6, we
could have almost a million unique records). So the number of steps
required is in the order O(kn), or just O(n).

8 / 11

Radix sort

The algorithm can be stated very elegantly:

procedure radixSort(list, n)    // sort list having n elements, with base 10
    shift ← 1
    for loop = 1 to keySize do
        for entry = 1 to n do
            bucketNum ← (list[entry]/shift) % 10
            append(buckets[bucketNum], list[entry])
        end for
        list ← combineBuckets()
        shift ← shift ∗ 10
    end for
end procedure
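The pseudocode can be sketched in Java as follows. This is an illustrative version under stated assumptions: keys are non-negative integers in base 10, `keySize` is the number of digits in the largest key, and the buckets are growable lists rather than a fixed 2D array. The class name `RadixSortSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RadixSortSketch {

    /** Least-significant-digit radix sort for non-negative base-10 keys. */
    static void radixSort(int[] list, int keySize) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int b = 0; b < 10; b++) {
            buckets.add(new ArrayList<>());
        }
        int shift = 1;
        for (int pass = 0; pass < keySize; pass++) {
            for (int value : list) {
                int bucketNum = (value / shift) % 10;   // digit examined on this pass
                buckets.get(bucketNum).add(value);
            }
            int k = 0;
            for (List<Integer> bucket : buckets) {      // combine buckets in order 0..9
                for (int value : bucket) {
                    list[k++] = value;
                }
                bucket.clear();
            }
            shift *= 10;
        }
    }

    public static void main(String[] args) {
        // Leading zeros are dropped here: a literal such as 023 would be octal in Java.
        int[] data = {310, 213, 23, 130, 13, 301, 222, 32, 201, 111, 323, 2, 330, 102, 231, 120};
        radixSort(data, 3);
        System.out.println(Arrays.toString(data));
    }
}
```

Using lists for the buckets means the extra storage grows with n rather than with R times n, at the cost of boxing each value.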

9 / 11

Radix sort

Since radix sort works in linear time, why do we even bother with
other algorithms?

The catch is in the memory usage. This depends on how we
implement the algorithm. If buckets is a 2D array, each element
has to be as big as the original list, because the whole list might
end up in the same bucket.

Thus, we need Rn additional storage, where R is the radix (what
could we do to reduce memory usage?). Also, each element will be
moved 2k times.

10 / 11

Next time

More basic data structures: stacks, queues and priority queues.

11 / 11


Data structures assignment/deadlock.zip


deadlock/a0.dat
/*
a0.dat
The “a” collection of process data files is meant to simulate
two processes competing for a single resource. If you run
the simulator with one resource available, one of the processes
will block until the other is done using the resource.
*/
C 10 // compute for 10 milliseconds
R 0 // request resource 0
C 10 // compute for 10 milliseconds
F 0 // free resource 0
H // halt process

deadlock/a1.dat
/*
a1.dat
The “a” collection of process data files is meant to simulate
two processes competing for a single resource. If you run
the simulator with one resource available, one of the processes
will block until the other is done using the resource.
*/
C 10 // compute for 10 milliseconds
R 0 // request resource 0
C 10 // compute for 10 milliseconds
F 0 // free resource 0
H // halt process

deadlock/b0.dat
/*
b0.dat
The “b” collection of process data files is meant to simulate
two processes in a simple deadlock scenario. If you run
the simulator with two resources with one of each available,
each process will allocate one of the resources, and then
block on the resource that the other has allocated.
*/
C 10 // compute for 10 milliseconds
R 0 // request resource 0
C 10 // compute for 10 milliseconds
R 1 // request resource 1
C 10 // compute for 10 milliseconds
F 1 // free resource 1
F 0 // free resource 0
H // halt process

deadlock/b1.dat
/*
b1.dat
The “b” collection of process data files is meant to simulate
two processes in a simple deadlock scenario. If you run
the simulator with two resources with one of each available,
each process will allocate one of the resources, and then
block on the resource that the other has allocated.
*/
C 10 // compute for 10 milliseconds
R 1 // request resource 1
C 10 // compute for 10 milliseconds
R 0 // request resource 0
C 10 // compute for 10 milliseconds
F 0 // free resource 0
F 1 // free resource 1
H // halt process

deadlock/Command.java

public class Command {
    private String keyword = null;
    private int parameter = 0;

    public Command() {
        super();
    }

    public Command(String newKeyword, int newParameter) {
        super();
        setKeyword(newKeyword);
        setParameter(newParameter);
    }

    public String getKeyword() {
        return keyword;
    }

    public int getParameter() {
        return parameter;
    }

    public void setKeyword(String newKeyword) {
        keyword = newKeyword;
    }

    public void setParameter(int newParameter) {
        parameter = newParameter;
    }

    public String toString() {
        return ("Command[keyword=" + keyword + ",parameter=" + parameter + "]");
    }
}


deadlock/CommandParser.java

import java.io.*;

public class CommandParser {
    private StreamTokenizer in;
    private InputStream inputStream;

    public CommandParser(InputStream newInputStream) {
        super();
        inputStream = newInputStream;
        in = new StreamTokenizer(inputStream);
        in.eolIsSignificant(true);
        in.ordinaryChar('/');
        in.slashSlashComments(true);
        in.slashStarComments(true);
    }

    public void close() throws IOException {
        inputStream.close();
    }

    public Command getCommand() {
        String commandLetter = null;
        int commandNumber = 0;
        try {
            int t;
            int state = 0; // 0: expect letter, 1: expect parameter, 2: expect EOL
            boolean looping = true;
            while (looping) {
                t = in.nextToken();
                if (t == StreamTokenizer.TT_EOF) {
                    if (state != 0)
                        throw new Exception("unexpected text on line " + in.lineno());
                    else
                        return null;
                }
                switch (state) {
                    case 0: // expect letter
                        if (t == StreamTokenizer.TT_WORD) {
                            if (in.sval.equals("C") || in.sval.equals("R") || in.sval.equals("F")) {
                                commandLetter = in.sval;
                                state = 1;
                            } else if (in.sval.equals("H")) {
                                commandLetter = in.sval;
                                state = 2;
                            } else
                                throw new Exception("C,R,F, or H expected at start of line " + in.lineno());
                        } else if (t == StreamTokenizer.TT_EOL)
                            ; // do nothing
                        else
                            throw new Exception("C,R,F, or H expected at start of line " + in.lineno());
                        break;
                    case 1: // expect parameter
                        if (t == StreamTokenizer.TT_NUMBER) {
                            commandNumber = (int) in.nval;
                            state = 2;
                        } else
                            throw new Exception("missing numeric value after command on line " + in.lineno());
                        break;
                    case 2: // expect EOL
                        if (t == StreamTokenizer.TT_EOL) {
                            state = 0;
                            looping = false;
                        } else
                            throw new Exception("unexpected text after command on line " + in.lineno());
                        break;
                }
                // System.out.println( "t " + t + " state " + state + " sval " + in.sval + " nval " + in.nval ) ;
            }
        } catch (IOException e) {
            System.out.println("IOException " + e);
        } catch (Exception e) {
            System.out.println(e.toString());
        }
        return new Command(commandLetter, commandNumber);

}

public

static

void
main
(

String
args
[]

)

{

String
f
=
args
[
0
]

;

System
.
out
.
println
(

“filename ”

+
f
)

;

try

{

CommandParser
cp
=

new

CommandParser
(

new

BufferedInputStream
(
new

FileInputStream
(
f
))

);

while

(

true

)

{

Command
command
=
cp
.
getCommand
()

;

if

(
command
==

null

)

break

;

}

cp
.
close
()

;

}

catch

(

IOException
e
)

{

System
.
out
.
println
(

“IOException ”

+
e
)

;

}
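
To see what the tokenizer settings in the constructor above do to a process data file, the sketch below applies the same four `StreamTokenizer` calls to an in-memory string and lists the tokens produced. It is an illustration only: `TokenizerDemo` and `tokenize` are hypothetical names, and it uses the `Reader`-based constructor rather than the `InputStream` one used by `CommandParser`.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenizerDemo {

    /** Tokenizes text with the same settings CommandParser applies. */
    static List<String> tokenize(String text) {
        StreamTokenizer in = new StreamTokenizer(new StringReader(text));
        in.eolIsSignificant(true);   // report line ends as TT_EOL tokens
        in.ordinaryChar('/');        // '/' alone no longer starts a comment
        in.slashSlashComments(true); // but "//" comments are skipped
        in.slashStarComments(true);  // and so are "/* ... */" comments
        List<String> tokens = new ArrayList<>();
        try {
            int t;
            while ((t = in.nextToken()) != StreamTokenizer.TT_EOF) {
                if (t == StreamTokenizer.TT_WORD) tokens.add("WORD " + in.sval);
                else if (t == StreamTokenizer.TT_NUMBER) tokens.add("NUMBER " + (int) in.nval);
                else if (t == StreamTokenizer.TT_EOL) tokens.add("EOL");
                else tokens.add("CHAR " + (char) t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String sample = "C 10 // compute for 10 milliseconds\n"
                      + "R 0 // request resource 0\n"
                      + "H // halt process\n";
        System.out.println(tokenize(sample));
    }
}
```

Each data line reduces to a word, an optional number, and an end-of-line token, which is the sequence the three parser states (letter, parameter, EOL) expect.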

deadlock/ControlPanel.java

import java.applet.*;
import java.awt.*;

public class ControlPanel extends Frame {
    boolean running = false;
    boolean hasBeenReset = false;
    Kernel kernel;
    Panel buttonPanel;
    Button runButton;
    Button stopButton;
    Button stepButton;
    Button resetButton;
    Button optionsButton;
    Button processesButton;
    Button resourcesButton;
    Button exitButton;
    Panel timePanel;
    Label timeLabel;
    Label timeValueLabel;
    TextField timeTextField;
    Panel topPanel;
    ProcessesPanel processesPanel;
    ResourcesPanel resourcesPanel;
    OptionsDialog optionsDialog;
    ProcessesDialog processesDialog;
    ResourcesDialog resourcesDialog;

    public ControlPanel() {
        super();
    }

    public ControlPanel(String title) {
        super(title);
    }

    public void init(Kernel useKernel) {
        kernel = useKernel;
        kernel.setControlPanel(this);

        runButton = new Button("run");
        stopButton = new Button("stop");
        stepButton = new Button("step");
        resetButton = new Button("reset");
        optionsButton = new Button("options");
        processesButton = new Button("processes");
        resourcesButton = new Button("resources");
        exitButton = new Button("exit");

        buttonPanel = new Panel();
        buttonPanel.add(runButton);
        buttonPanel.add(stopButton);
        buttonPanel.add(stepButton);
        buttonPanel.add(resetButton);
        buttonPanel.add(optionsButton);
        buttonPanel.add(processesButton);
        buttonPanel.add(resourcesButton);
        buttonPanel.add(exitButton);

        timeLabel = new Label("Time: ", Label.RIGHT);
        timeValueLabel = new Label(Integer.toString(kernel.getTime()), Label.LEFT);
        timeValueLabel.setForeground(Color.white);
        timeValueLabel.setBackground(Color.black);
        // timeTextField = new TextField( Integer.toString( kernel.getTime() ), 6) ;
        // timeTextField.setEditable(false) ;
        // timeTextField.enable(false);
        // timeTextField.setForeground( Color.white ) ;
        // timeTextField.setBackground( Color.black ) ;

        timePanel = new Panel();
        timePanel.add(timeLabel);
        timePanel.add(timeValueLabel);
        // timePanel.add( timeTextField ) ;

        processesPanel = new ProcessesPanel(kernel.getProcessCount());
        processesPanel.setProcesses(kernel.getProcesses());
        processesPanel.init();

        resourcesPanel = new ResourcesPanel(kernel.getResourceCount());
        resourcesPanel.setResources(kernel.getResources());
        resourcesPanel.init();

        optionsDialog = new OptionsDialog(this, "optionsDialog", true);
        processesDialog = new ProcessesDialog(this, "processesDialog", true);
        processesDialog.setProcesses(kernel.getProcesses());
        resourcesDialog = new ResourcesDialog(this, "resourcesDialog", true);
        resourcesDialog.setResources(kernel.getResources());

        stopButton.disable();
        runButton.requestFocus();

        topPanel = new Panel();
        topPanel.setLayout(new BorderLayout());
        topPanel.add("North", buttonPanel);
        topPanel.add("South", timePanel);

        GridBagLayout gbl = new GridBagLayout();
        GridBagConstraints gbc = new GridBagConstraints();

        gbc.gridx = 1;
        gbc.gridy = 1;
        gbc.gridwidth = 2;
        gbc.gridheight = 1;
        gbl.setConstraints(topPanel, gbc);

        gbc.gridx = 1;
        gbc.gridy = 2;
        gbc.gridwidth = 1;
        gbc.gridheight = 1;
        gbc.anchor = GridBagConstraints.NORTH;
        gbl.setConstraints(processesPanel, gbc);

        gbc.gridx = 2;
        gbc.gridy = 2;
        gbc.gridwidth = 1;
        gbc.gridheight = 1;
        gbc.anchor = GridBagConstraints.NORTH;
        gbl.setConstraints(resourcesPanel, gbc);

        add(topPanel);
        add(processesPanel);
        add(resourcesPanel);
        setLayout(gbl);

        kernel.reset();
        hasBeenReset = true;

        // size the frame and center it on the screen
        pack();
        Dimension screenSize = getToolkit().getScreenSize();
        Dimension size = getSize();
        setLocation((screenSize.width - size.width + 1) / 2,
                    (screenSize.height - size.height + 1) / 2);
        show();
        requestFocus();
    }

    public boolean action(Event e, Object arg) {
        if (e.target == runButton) {
            if (!hasBeenReset) {
                kernel.reset();
                hasBeenReset = true;
            }
            runButton.disable();
            stopButton.enable();
            stepButton.disable();
            resetButton.disable();
            optionsButton.disable();
            processesButton.disable();
            resourcesButton.disable();
            stopButton.requestFocus();
            kernel.setStepping(false);
            kernel.resume();
            running = true;
            return true;
        } else if (e.target == stopButton) {
            stopAction();
            kernel.suspend();
            return true;
        } else if (e.target == stepButton) {
            if (!hasBeenReset) {
                kernel.reset();
                hasBeenReset = true;
            }
            kernel.setStepping(true);
            kernel.resume();
            return true;
        } else if (e.target == resetButton) {
            kernel.reset();
            hasBeenReset = true;
            return true;
        } else if (e.target == optionsButton) {
            optionsDialog.setProcessCount(kernel.getProcessCount());
            optionsDialog.setResourceCount(kernel.getResourceCount());
            optionsDialog.setSleepTime(kernel.getSleepTime());
            optionsDialog.show();
            kernel.setProcessCount(optionsDialog.getProcessCount());
            kernel.setResourceCount(optionsDialog.getResourceCount());
            kernel.setSleepTime(optionsDialog.getSleepTime());
            processesPanel.setProcessCount(optionsDialog.getProcessCount());
            resourcesPanel.setResourceCount(optionsDialog.getResourceCount());
            hasBeenReset = false;
            return true;
        } else if (e.target == processesButton) {
            processesDialog.setProcessCount(kernel.getProcessCount());
            processesDialog.show();
            processesPanel.show();
            hasBeenReset = false;
            return true;
        } else if (e.target == resourcesButton) {
            resourcesDialog.setResourceCount(kernel.getResourceCount());
            resourcesDialog.show();
            resourcesPanel.show();
            hasBeenReset = false;
            return true;
        } else if (e.target == exitButton) {
            kernel.stop();
            System.exit(0);
            return true;
        } else {
            System.out.println(e.toString());
            return false;
        }
    }

    public void stopAction() {
        runButton.enable();
        stopButton.disable();
        stepButton.enable();
        resetButton.enable();
        optionsButton.enable();
        processesButton.enable();
        resourcesButton.enable();
        runButton.requestFocus();
        running = false;
    }

    public void setTime(int newTime) {
        Dimension oldSize = timeValueLabel.getSize();
        timeValueLabel.setText(Integer.toString(newTime));
        Dimension newSize = timeValueLabel.getMinimumSize();
        if (newSize.width > oldSize.width)
            timeValueLabel.invalidate();
    }

    public void setProcessId(int i, int newId) {
        processesPanel.setProcessId(i, newId);
    }

    public void setProcessState(int i, String newState) {
        processesPanel.setProcessState(i, newState);
    }

    public void setProcessResource(int i, String newResource) {
        processesPanel.setProcessResource(i, newResource);
    }

    public void setResourceId(int i, int newId) {
        resourcesPanel.setResourceId(i, newId);
    }

    public void setResourceAvailable(int i, int newAvailable) {
        resourcesPanel.setResourceAvailable(i, newAvailable);
    }
}

deadlock/DatFilenameFilter.java

import java.io.File;
import java.io.FilenameFilter;

public class DatFilenameFilter implements FilenameFilter {
    public DatFilenameFilter() {
        super();
    }

    public boolean accept(File dir, String name) {
        return name.endsWith(".dat");
    }
}
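
A filter like this one is typically passed to `File.list(FilenameFilter)`. The sketch below shows that pattern with an equivalent lambda on a temporary directory; `DatFilterDemo`, `listDatFiles`, `makeSampleDir`, and the sample file names are hypothetical and exist only for this illustration.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

class DatFilterDemo {

    /** Returns the sorted names of the ".dat" files directly inside dir. */
    static String[] listDatFiles(File dir) {
        // Same test as DatFilenameFilter.accept, written as a lambda.
        FilenameFilter datOnly = (d, name) -> name.endsWith(".dat");
        String[] names = dir.list(datOnly);
        if (names == null) return new String[0]; // dir is not a readable directory
        Arrays.sort(names);
        return names;
    }

    /** Creates a temporary directory holding two .dat files and one other file. */
    static File makeSampleDir() {
        try {
            File dir = Files.createTempDirectory("deadlock-demo").toFile();
            new File(dir, "a0.dat").createNewFile();
            new File(dir, "b0.dat").createNewFile();
            new File(dir, "notes.txt").createNewFile();
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(listDatFiles(makeSampleDir())));
    }
}
```

As the comment in `ProcessesDialog` notes, `FileDialog.setFilenameFilter` may not be honored on all platforms, so filtering a directory listing directly is the more predictable use.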

deadlock/deadlock.java

public class deadlock {
    /**
    This is the main method that runs the application. The first
    argument is the process filename prefix and is required. The second
    argument is the number of processes to create, while the subsequent
    arguments are the number of each resource that is initially available.
    */
    public static void main(String args[]) {
        ControlPanel controlPanel;
        Kernel kernel;

        kernel = new Kernel();

        if (args.length == 0) {
            System.out.println("Usage: java deadlock … ");
            System.exit(0);
        }

        if (args.length > 0) {
            kernel.setProcessFilenamePrefix(args[0]);
        }

        if (args.length > 1)
            try {
                kernel.setProcessCount(Integer.valueOf(args[1]).intValue());
            } catch (NumberFormatException e) {
                System.err.println("Invalid number \"" + args[1] + "\" specified as process count");
                System.exit(0);
            }

        if (args.length > 2) {
            kernel.setResourceCount(args.length - 2);
            for (int i = 2; i < args.length; i++)
                try {
                    kernel.setResourceInitialAvailable(i - 2, Integer.valueOf(args[i]).intValue());
                } catch (NumberFormatException e) {
                    System.err.println("Invalid number \"" + args[i]
                        + "\" specified as count of available resources for resource "
                        + Integer.toString(i - 2) + ".");
                    System.exit(0);
                }
        }

        kernel.start();
        controlPanel = new ControlPanel("deadlock");
        controlPanel.init(kernel);
    }
}

deadlock/DeadlockManager.java

import java.util.Vector;

public class DeadlockManager {
    private static Vector resources;
    private static Vector processes;

    public static void setResources(Vector newResources) {
        resources = newResources;
    }

    public static void setProcesses(Vector newProcesses) {
        processes = newProcesses;
    }

    /**
    Can the process be granted the resource?
    */
    public static boolean grantable(int id, Resource resource) {
        return available(id, resource);
    }

    public static boolean available(int id, Resource resource) {
        return (resource.getCurrentAvailable() > 0);
    }

    public static void allocate(int id, Resource resource) {
        resource.setCurrentAvailable(resource.getCurrentAvailable() - 1);
        // we also need to note that the process has the resource allocated to it
        Process p = (Process) processes.elementAt(id);
        p.addAllocatedResource(resource);
    }

    public static void deallocate(int id, Resource resource) {
        resource.setCurrentAvailable(resource.getCurrentAvailable() + 1);
        // we also need to note that this process no longer has the resource allocated
        Process p = (Process) processes.elementAt(id);
        p.removeAllocatedResource(resource);
    }

    /**
    All processes are blocked. One of them should be killed
    and its resources deallocated so that the others can continue.
    */
    public static void deadlocked() {
    }
}
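
The `deadlocked()` stub above only states the goal: kill one blocked process and return its resources. One way this could look is sketched below. Everything in it is an assumption made for illustration: `Proc` is a hypothetical stand-in for the simulator's `Process`, and the victim policy (the blocked process holding the fewest resources, ties going to the first in the list) is just one of several reasonable choices, not the simulator's behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeadlockRecoverySketch {

    /** Hypothetical stand-in for a process: an id plus the ids of the resources it holds. */
    static class Proc {
        final int id;
        final List<Integer> held = new ArrayList<>();
        boolean killed = false;

        Proc(int id, int... resources) {
            this.id = id;
            for (int r : resources) held.add(r);
        }
    }

    /**
     * Chooses the blocked process that holds the fewest resources (assumed policy),
     * returns its resources to the pool, and marks it killed.
     * Returns the victim, or null if the list is empty.
     */
    static Proc recover(List<Proc> blocked, int[] available) {
        Proc victim = null;
        for (Proc p : blocked) {
            if (victim == null || p.held.size() < victim.held.size()) victim = p;
        }
        if (victim == null) return null;
        for (int r : victim.held) available[r]++; // give each held unit back
        victim.held.clear();
        victim.killed = true;
        return victim;
    }

    public static void main(String[] args) {
        int[] available = {0, 0};
        List<Proc> blocked = new ArrayList<>();
        blocked.add(new Proc(0, 0)); // holds resource 0, waits for resource 1
        blocked.add(new Proc(1, 1)); // holds resource 1, waits for resource 0
        Proc victim = recover(blocked, available);
        System.out.println("killed process " + victim.id
            + ", available = " + Arrays.toString(available));
    }
}
```

In the real simulator the same idea would also have to update the blocked and halted counts kept by `Kernel`, which this sketch leaves out.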

deadlock/Kernel.java

import
java
.
util
.
Vector

;

import
java
.
lang
.
Thread

;

import
java
.
io
.
IOException

;

public

class

Kernel

extends

Thread

{

private

int
time
=

0

;

private

int
sleepTime
=

1000

;

private

String
processFilenamePrefix
=

“process”

;

private

String
processFilenameSuffix
=

“.dat”

;

private

int
processCount
;

private

int
resourceCount
;

private

Vector
processes
=

new

Vector
()

;

private

Vector
resources
=

new

Vector
()

;

private

ControlPanel
controlPanel
;

private

boolean
stepping
=

false

;

private

int
haltedCount
=

0

;

private

int
blockedCount
=

0

;

public

void
processHalted
()

{

haltedCount
++

;

}

public

void
processBlocked
()

{

blockedCount
++

;

}

public

void
processUnblocked
()

{

blockedCount
—

;

}

public

void
setProcessFilenamePrefix
(

String
newProcessFilenamePrefix
)

{

processFilenamePrefix
=
newProcessFilenamePrefix
;

}

public

String
getProcessFilenamePrefix
(

)

{

return
processFilenamePrefix
;

}

public

void
setProcessFilenameSuffix
(

String
newProcessFilenameSuffix
)

{

processFilenameSuffix
=
newProcessFilenameSuffix
;

}

public

String
getProcessFilenameSuffix
(

)

{

return
processFilenameSuffix
;

}

public

boolean
getStepping
()

{

return
stepping
;

}

public

void
setStepping
(

boolean
newStepping
)

{

stepping
=
newStepping
;

}

public

int
getTime
(

)

{

return
time
;

}

public

void
setTime
(

int
newTime
)

{

time
=
newTime
;

}

public

int
getSleepTime
()

{

return
sleepTime
;

}

public

void
setSleepTime
(

int
newSleepTime
)

{

sleepTime
=
newSleepTime
;

}

public

void
setControlPanel
(

ControlPanel
newControlPanel
)

{

controlPanel
=
newControlPanel
;

}

public

void
setProcessCount
(

int
newProcessCount
)

{

if

(
newProcessCount
>
processCount
)

{

for

(

int
i
=
processCount
;
i
< newProcessCount ; i ++ ) processes . addElement ( new Process ( i , processFilenamePrefix + Integer . toString ( i ) + processFilenameSuffix ) ) ; } else if ( newProcessCount < processCount ) { processes . setSize ( newProcessCount ) ; processes . trimToSize ( ) ; } processCount = newProcessCount ; } public int getProcessCount ( ) { return processCount ; } public void setResourceCount ( int newResourceCount ) { if ( newResourceCount >
resourceCount
)

{

for

(

int
i
=
resourceCount
;
i
< newResourceCount ; i ++ ) resources . addElement ( new Resource ( i ) ) ; } else if ( newResourceCount < resourceCount ) { resources . setSize ( newResourceCount ) ; resources . trimToSize ( ) ; } resourceCount = newResourceCount ; } public int getResourceCount ( ) { return resourceCount ; } public void setResourceInitialAvailable ( int resource , int newInitialAvailable ) { (( Resource )( resources . elementAt ( resource ))). setInitialAvailable ( newInitialAvailable ) ; } public int getResourceInitialAvailable ( int resource ) { return (( Resource )( resources . elementAt ( resource ))). getInitialAvailable () ; } public Vector getResources () { return resources ; } public Vector getProcesses () { return processes ; } public void reset () { haltedCount = 0 ; blockedCount = 0 ; // reset all the resources for ( int i = 0 ; i < resourceCount ; i ++ ) { Resource resource = ( Resource ) resources . elementAt ( i ) ; resource . setCurrentAvailable ( resource . getInitialAvailable () ) ; } // reset all the processes for ( int i = 0 ; i < processCount ; i ++ ) { try { (( Process ) processes . elementAt ( i )). reset () ; } catch ( IOException e ) { // the reset should fail, and an error message should be displayed. System . out . println ( "unable to open \"" + (( Process ) processes . elementAt ( i )). getFilename () + "\" for input" ) ; } } time = 0 ; updateControlPanel (); } public void step () { for ( int i = 0 ; i < processCount ; i ++ ) { Process process = ( Process ) processes . elementAt ( i ) ; /** Perform one millisecond of work for a process. Note that only computable processes consume time. It does not take time to grant a resource, nor does it take time to free a resource. If a process is blocked, or if it is halted, the next process is free to execute. */ boolean running = true ; while ( running ) { switch ( process . state ) { case Process . STATE_UNKNOWN : Command command = process . cp . getCommand (); if ( command == null ) { process . 
state = Process . STATE_HALT ; processHalted () ; } else { String keyword = command . getKeyword () ; if ( keyword . equals ( "C" ) ) { process . timeToCompute = command . getParameter () ; process . state = Process . STATE_COMPUTABLE ; } else if ( keyword . equals ( "R" ) ) { // allocate resource command.getParameter() // state depends on whether the resource is available int r = command . getParameter () ; Resource resource = ( Resource ) resources . elementAt ( r ); if ( DeadlockManager . grantable ( i , resource ) ) DeadlockManager . allocate ( i , resource ) ; else { // increment blocked count processBlocked (); process . resourceAwaiting = resource ; process . state = Process . STATE_RESOURCE_WAIT ; if ( processCount == blockedCount ) { // deadlocked; have the deadlock manager kill a process. DeadlockManager . deadlocked (); } } } else if ( keyword . equals ( "F" ) ) { // free resource command.getParameter() (if it was allocated ???) // state depends on whether there are other commands int r = command . getParameter () ; Resource resource = ( Resource ) resources . elementAt ( r ); DeadlockManager . deallocate ( i , resource ) ; } else if ( keyword . equals ( "H" ) ) { process . state = Process . STATE_HALT ; processHalted () ; } } break ; case Process . STATE_COMPUTABLE : if ( process . timeToCompute >

0

)

{

process
.
timeToCompute
—

;

running
=

false

;

}

else

process
.
state
=

Process
.
STATE_UNKNOWN
;

break

;

case

Process
.
STATE_RESOURCE_WAIT
:

if

(

DeadlockManager
.
available
(
i
,
process
.
resourceAwaiting
)

)

if

(

DeadlockManager
.
grantable
(
i
,
process
.
resourceAwaiting
)

)

{

DeadlockManager
.
allocate
(
i
,
process
.
resourceAwaiting
)

;

process
.
state
=

Process
.
STATE_UNKNOWN
;

process
.
resourceAwaiting
=

null

;

processUnblocked
()

;

}

else

running
=

false

;

// continue to be blocked on this resource

else

running
=

false

;

// continue to be blocked on this resource

break

;

case

Process
.
STATE_HALT
:

// we’re already stopped, no need to do anything

running
=

false

;

break

;

}

printStatus
()

;

time
++

;

updateControlPanel
();

}

private

void
updateControlPanel
()

{

controlPanel
.
setTime
(
time
)

;

for
(

int
i
=

0

;
i
< resourceCount ; i ++ ) { Resource resource = ( Resource ) resources . elementAt ( i ) ; controlPanel . setResourceId ( i , resource . getId ()); controlPanel . setResourceAvailable ( i , resource . getCurrentAvailable () ) ; } for ( int i = 0 ; i < processCount ; i ++ ) { Process process = ( Process ) processes . elementAt ( i ) ; controlPanel . setProcessId ( i , process . getId ()); controlPanel . setProcessState ( i , process . getState () ) ; Resource resourceAwaiting = process . getResourceAwaiting () ; if ( resourceAwaiting == null ) { controlPanel . setProcessResource ( i , "" ) ; } else { controlPanel . setProcessResource ( i , Integer . toString ( resourceAwaiting . getId ())) ; } } controlPanel . validate () ; } public void printStatus () { System . out . print ( "time = " + time + " available =" ) ; for ( int i = 0 ; i < resourceCount ; i ++ ) { Resource resource = (( Resource ) resources . elementAt ( i )) ; System . out . print ( " " + resource . getCurrentAvailable () ) ; } System . out . println ( " blocked = " + blockedCount ) ; } public void run () { DeadlockManager . setProcesses ( processes ) ; DeadlockManager . setResources ( resources ) ; suspend () ; while ( true ) { step (); if ( stepping ) { suspend () ; } else { try { sleep ( sleepTime ) ; } catch ( InterruptedException e ) { // do nothing } } if ( processCount == haltedCount || processCount == blockedCount ) { controlPanel . stopAction () ; suspend () ; } } } } __MACOSX/deadlock/._Kernel.java deadlock/OptionsDialog.java deadlock/OptionsDialog.java import java . awt . 
* ; public class OptionsDialog extends Dialog { Button okButton ; Button cancelButton ; Label processLabel ; TextField processTextField ; Label resourceLabel ; TextField resourceTextField ; Label sleepTimeLabel ; TextField sleepTimeTextField ; Panel topPanel ; Panel bottomPanel ; Panel processPanel ; Panel resourcePanel ; Panel labelPanel ; Panel textFieldPanel ; private int processCount = 0 ; private int resourceCount = 0 ; private int sleepTime = 0 ; public OptionsDialog ( Frame parent ) { super ( parent ) ; init () ; } public OptionsDialog ( Frame parent , boolean modal ) { super ( parent , modal ) ; init () ; } public OptionsDialog ( Frame parent , String title ) { super ( parent , title ) ; init () ; } public OptionsDialog ( Frame parent , String title , boolean modal ) { super ( parent , title , modal ) ; init () ; } public void setProcessCount ( int newProcessCount ) { processCount = newProcessCount ; } public int getProcessCount ( ) { return processCount ; } public void setResourceCount ( int newResourceCount ) { resourceCount = newResourceCount ; } public int getResourceCount ( ) { return resourceCount ; } public void setSleepTime ( int newSleepTime ) { sleepTime = newSleepTime ; } public int getSleepTime ( ) { return sleepTime ; } public void init () { topPanel = new Panel () ; bottomPanel = new Panel () ; processLabel = new Label ( "Number of processes:" , Label . RIGHT ) ; processTextField = new TextField ( Integer . toString ( processCount ) ) ; resourceLabel = new Label ( "Number of resources:" , Label . RIGHT ) ; resourceTextField = new TextField ( Integer . toString ( resourceCount ) ) ; sleepTimeLabel = new Label ( "Milliseconds per step:" , Label . RIGHT ) ; sleepTimeTextField = new TextField ( Integer . toString ( sleepTime ) , 6 ) ; GridBagLayout gbl = new GridBagLayout () ; GridBagConstraints gbc = new GridBagConstraints () ; gbc . gridx = 1 ; gbc . gridy = 1 ; gbc . anchor = GridBagConstraints . EAST ; gbl . 
setConstraints ( processLabel , gbc ) ; gbc . gridx = 2 ; gbc . gridy = 1 ; gbc . anchor = GridBagConstraints . WEST ; gbl . setConstraints ( processTextField , gbc ) ; gbc . gridx = 1 ; gbc . gridy = 2 ; gbc . anchor = GridBagConstraints . EAST ; gbl . setConstraints ( resourceLabel , gbc ) ; gbc . gridx = 2 ; gbc . gridy = 2 ; gbc . anchor = GridBagConstraints . WEST ; gbl . setConstraints ( resourceTextField , gbc ) ; gbc . gridx = 1 ; gbc . gridy = 3 ; gbc . anchor = GridBagConstraints . EAST ; gbl . setConstraints ( sleepTimeLabel , gbc ) ; gbc . gridx = 2 ; gbc . gridy = 3 ; gbc . anchor = GridBagConstraints . WEST ; gbl . setConstraints ( sleepTimeTextField , gbc ) ; topPanel . setLayout ( gbl ); topPanel . add ( processLabel ); topPanel . add ( processTextField ); topPanel . add ( resourceLabel ); topPanel . add ( resourceTextField ); topPanel . add ( sleepTimeLabel ) ; topPanel . add ( sleepTimeTextField ) ; okButton = new Button ( "ok" ) ; cancelButton = new Button ( "cancel" ) ; bottomPanel . add ( okButton ); bottomPanel . add ( cancelButton ); setLayout ( new BorderLayout () ) ; add ( "North" , topPanel ) ; add ( "South" , bottomPanel ); pack (); } public void show () { processTextField . setText ( Integer . toString ( processCount ) ) ; resourceTextField . setText ( Integer . toString ( resourceCount ) ) ; sleepTimeTextField . setText ( Integer . toString ( sleepTime ) ) ; pack (); super . show (); } public boolean action ( Event e , Object arg ) { if ( e . target == okButton ) { try { setProcessCount ( Integer . valueOf ( processTextField . getText (). trim ()). intValue () ) ; try { setResourceCount ( Integer . valueOf ( resourceTextField . getText (). trim ()). intValue () ) ; try { setSleepTime ( Integer . valueOf ( sleepTimeTextField . getText (). trim ()). intValue () ) ; this . hide (); } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for sleep time" ); sleepTimeTextField . 
requestFocus (); } } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for resource count" ); resourceTextField . requestFocus (); } } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for process count" ); processTextField . requestFocus (); } return true ; } else if ( e . target == cancelButton ) { this . hide () ; return true ; } else if ( e . target == processTextField ) { processTextField . setText ( processTextField . getText (). trim () ) ; return true ; } else if ( e . target == resourceTextField ) { resourceTextField . setText ( resourceTextField . getText (). trim () ) ; return true ; } else return false ; } } deadlock/Process.java deadlock/Process.java import java . io . * ; import java . util . Vector ; public class Process { public static final int STATE_UNKNOWN = 0 ; public static final int STATE_COMPUTABLE = 1 ; public static final int STATE_RESOURCE_WAIT = 2 ; public static final int STATE_HALT = 3 ; protected int id ; private String filename = "" ; protected CommandParser cp ; protected int state = STATE_UNKNOWN ; protected int timeToCompute ; protected Resource resourceAwaiting = null ; private Vector allocatedResources = new Vector () ; public Process ( int newId , String newFilename ) { super () ; id = newId ; filename = newFilename ; } public int getId () { return id ; } public String getFilename () { return filename ; } public void setFilename ( String newFilename ) { filename = newFilename ; } public String getState () { switch ( state ) { case STATE_RESOURCE_WAIT : return "W" ; case STATE_COMPUTABLE : return "C" ; case STATE_HALT : return "H" ; default : return "U" ; } } public void setState ( String newState ) { if ( newState . equals ( "W" ) ) state = STATE_RESOURCE_WAIT ; else if ( newState . equals ( "C" ) ) state = STATE_COMPUTABLE ; else if ( newState . 
equals ( "H" ) ) state = STATE_HALT ; else state = STATE_UNKNOWN ; } public Resource getResourceAwaiting ( ) { return resourceAwaiting ; } public void addAllocatedResource ( Resource resource ) { allocatedResources . addElement ( resource ) ; } public void removeAllocatedResource ( Resource resource ) { allocatedResources . removeElement ( resource ) ; } public void reset () throws IOException { cp = new CommandParser ( new BufferedInputStream ( new FileInputStream ( filename ) ) ) ; state = STATE_UNKNOWN ; timeToCompute = 0 ; resourceAwaiting = null ; allocatedResources . removeAllElements () ; } } __MACOSX/deadlock/._Process.java deadlock/ProcessesDialog.java deadlock/ProcessesDialog.java import java . awt . BorderLayout ; import java . awt . Button ; import java . awt . Dialog ; import java . awt . Event ; import java . awt . FileDialog ; import java . awt . Frame ; import java . awt . GridBagConstraints ; import java . awt . GridBagLayout ; import java . awt . Label ; import java . awt . Panel ; import java . awt . TextField ; import java . io . File ; import java . util . Vector ; public class ProcessesDialog extends Dialog { Button okButton ; Button cancelButton ; Label processLabel ; Label processCountLabel ; Panel topPanel ; Panel processPanel ; Panel bottomPanel ; FileDialog fileDialog = null ; int processCount ; Vector processes ; Vector filenames = new Vector () ; Vector chooseButtons = new Vector () ; public ProcessesDialog ( Frame parent ) { super ( parent ) ; init () ; } public ProcessesDialog ( Frame parent , boolean modal ) { super ( parent , modal ) ; init () ; } public ProcessesDialog ( Frame parent , String title ) { super ( parent , title ) ; init () ; } public ProcessesDialog ( Frame parent , String title , boolean modal ) { super ( parent , title , modal ) ; init () ; } public void setProcessCount ( int newProcessCount ) { processCount = newProcessCount ; filenames . setSize ( newProcessCount ); filenames . 
trimToSize (); } public void setProcesses ( Vector newProcesses ) { processes = newProcesses ; } public void init () { topPanel = new Panel () ; processPanel = new Panel () ; bottomPanel = new Panel () ; processLabel = new Label ( "Number of processes:" , Label . RIGHT ) ; processCountLabel = new Label ( Integer . toString ( processCount ) , Label . LEFT ) ; topPanel . add ( processLabel ); topPanel . add ( processCountLabel ) ; okButton = new Button ( "ok" ) ; cancelButton = new Button ( "cancel" ) ; bottomPanel . add ( okButton ); bottomPanel . add ( cancelButton ) ; setLayout ( new BorderLayout () ) ; add ( "North" , topPanel ) ; add ( "Center" , processPanel ) ; add ( "South" , bottomPanel ); } public void show () { GridBagLayout gbl = new GridBagLayout () ; GridBagConstraints gbc = new GridBagConstraints (); processCountLabel . setText ( Integer . toString ( processCount ) ) ; // we need to add the correct number of lines to the panel processPanel . removeAll (); Label idLabel = new Label ( "Id" , Label . RIGHT ); processPanel . add ( idLabel ); gbc . gridx = 1 ; gbc . gridy = 1 ; gbc . fill = GridBagConstraints . NONE ; gbl . setConstraints ( idLabel , gbc ) ; Label filenameLabel = new Label ( "File Name" , Label . LEFT ) ; processPanel . add ( filenameLabel ); gbc . gridx = 2 ; gbc . gridy = 1 ; gbl . setConstraints ( filenameLabel , gbc ) ; for ( int i = 0 ; i < processCount ; i ++ ) { Process process = ( Process ) processes . elementAt ( i ) ; // add the vector of labels and fields here idLabel = new Label ( Integer . toString ( i ) , Label . RIGHT ) ; processPanel . add ( idLabel ); gbc . gridx = 1 ; gbc . gridy = 2 + i ; gbl . setConstraints ( idLabel , gbc ) ; TextField filenameTextField = new TextField ( process . getFilename () , 32 ) ; filenames . insertElementAt ( filenameTextField , i ); processPanel . add ( filenameTextField ); gbc . gridx = 2 ; gbc . gridy = 2 + i ; gbl . 
setConstraints ( filenameTextField , gbc ) ; Button chooseButton = new Button ( "choose" ) ; chooseButtons . insertElementAt ( chooseButton , i ); processPanel . add ( chooseButton ); gbc . gridx = 3 ; gbc . gridy = 2 + i ; gbl . setConstraints ( chooseButton , gbc ) ; if ( i == 0 ) { filenameTextField . requestFocus () ; } } if ( processCount == 0 ) { cancelButton . requestFocus () ; } processPanel . setLayout ( gbl ); pack () ; super . show (); } public boolean action ( Event e , Object arg ) { if ( e . target == okButton ) { // before we set the processes, we need to validate the values ??? for ( int i = 0 ; i < processCount ; i ++ ) { Process process ; process = ( Process ) processes . elementAt ( i ) ; process . setFilename ((( TextField ) filenames . elementAt ( i )). getText ()); } this . hide () ; return true ; } else if ( e . target == cancelButton ) { this . hide () ; return true ; } else if ( (( String ) e . arg ). equals ( "choose" ) ) { // find the TextField corresponding to the button and update its value // if a value returned by the dialog. int i = chooseButtons . indexOf ( e . target ) ; File f = new File ( (( TextField ) filenames . elementAt ( i )). getText () ) ; if ( fileDialog == null ) { fileDialog = new FileDialog ( ( Frame ) this . getParent () , "Open file" , FileDialog . LOAD ) ; } fileDialog . setDirectory ( f . getParent () ) ; fileDialog . setFile ( f . getName () ) ; // filename filtering may not be implemented on all platforms. // fileDialog.setFilenameFilter( new DatFilenameFilter() ) ; fileDialog . show (); if ( fileDialog . getFile () != null ) { TextField filenameTextField = (( TextField ) filenames . elementAt ( i )); String filename = fileDialog . getDirectory () + fileDialog . getFile () ; filenameTextField . setText ( filename ) ; } return true ; } else return false ; } } __MACOSX/deadlock/._ProcessesDialog.java deadlock/ProcessesPanel.java deadlock/ProcessesPanel.java import java . awt . Dimension ; import java . awt . 
Panel ; import java . awt . GridBagConstraints ; import java . awt . GridBagLayout ; import java . awt . Label ; import java . util . Vector ; public class ProcessesPanel extends Panel { Label processesLabel ; Label idLabel ; Label stateLabel ; Label resourceLabel ; int processCount = 0 ; Vector processes ; Vector processIdLabelVector = new Vector (); Vector processStateLabelVector = new Vector () ; Vector processResourceLabelVector = new Vector () ; ProcessesPanel () { super () ; } ProcessesPanel ( int newProcessCount ) { super () ; processCount = newProcessCount ; } public void setProcessCount ( int newProcessCount ) { if ( newProcessCount >
processCount) {
    GridBagConstraints gbc;
    GridBagLayout gbl = (GridBagLayout) this.getLayout();
    Label idLabel;
    Label availableLabel;
    // add the objects to the vector
    // add the objects to the panel
    for (int i = processCount; i
< newProcessCount ; i ++ ) { idLabel = new Label ( ) ; stateLabel = new Label ( ); resourceLabel = new Label ( ) ; // add the objects to the vector processIdLabelVector . insertElementAt ( idLabel , i ) ; processStateLabelVector . insertElementAt ( stateLabel , i ) ; processResourceLabelVector . insertElementAt ( resourceLabel , i ) ; // add the constraints to the layout manager gbc = new GridBagConstraints () ; gbc . gridx = 1 ; gbc . gridy = 3 + i ; gbl . setConstraints ( idLabel , gbc ) ; gbc = new GridBagConstraints () ; gbc . gridx = 2 ; gbc . gridy = 3 + i ; gbl . setConstraints ( stateLabel , gbc ) ; gbc = new GridBagConstraints () ; gbc . gridx = 3 ; gbc . gridy = 3 + i ; gbl . setConstraints ( resourceLabel , gbc ) ; // add the objects to the panel this . add ( idLabel ); this . add ( stateLabel ) ; this . add ( resourceLabel ) ; } // redo the layout of the panel this . layout (); } else if ( newProcessCount < processCount ) { for ( int i = processCount - 1 ; i >=
newProcessCount; i--) {
        // remove the objects from the panel
        this.remove((Label) processIdLabelVector.elementAt(i));
        this.remove((Label) processStateLabelVector.elementAt(i));
        this.remove((Label) processResourceLabelVector.elementAt(i));
        // remove the objects from the vector
        processIdLabelVector.removeElementAt(i);
        processStateLabelVector.removeElementAt(i);
        processResourceLabelVector.removeElementAt(i);
    }
    // redo the layout of the panel
    this.layout();
}
processCount = newProcessCount;
}

public void setProcesses(Vector newProcesses) {
    processes = newProcesses;
}

public void setProcessId(int i, int newId) {
    Label label = (Label) processIdLabelVector.elementAt(i);
    Dimension oldSize = label.getSize();
    label.setText(Integer.toString(newId));
    Dimension newSize = label.getMinimumSize();
    if (newSize.width > oldSize.width)
        label.invalidate();
}

public void setProcessState(int i, String newState) {
    Label label = (Label) processStateLabelVector.elementAt(i);
    Dimension oldSize = label.getSize();
    label.setText(newState);
    Dimension newSize = label.getMinimumSize();
    if (newSize.width > oldSize.width)
        label.invalidate();
}

public void setProcessResource(int i, String newResourceName) {
    Label label = (Label) processResourceLabelVector.elementAt(i);
    Dimension oldSize = label.getSize();
    label.setText(newResourceName);
    Dimension newSize = label.getMinimumSize();
    if (newSize.width > oldSize.width)
        label.invalidate();
}

public void init() {
    GridBagConstraints gbc;
    GridBagLayout gbl = new GridBagLayout();
    processesLabel = new Label("Processes");
    gbc = new GridBagConstraints();
    gbc.gridx = 1;
    gbc.gridy = 1;
    gbc.gridwidth = GridBagConstraints.REMAINDER;
    gbl.setConstraints(processesLabel, gbc);
    idLabel = new Label("Id", Label.RIGHT);
    gbc = new GridBagConstraints();
    gbc.gridx = 1;
    gbc.gridy = 2;
    gbl.setConstraints(idLabel, gbc);
    stateLabel = new Label("State", Label.RIGHT);
    gbc = new GridBagConstraints();
    gbc.gridx = 2;
    gbc.gridy = 2;
    gbl.setConstraints(stateLabel, gbc);
    resourceLabel = new Label("Resource", Label.RIGHT);
    gbc = new GridBagConstraints();
    gbc.gridx = 3;
    gbc.gridy = 2;
    gbl.setConstraints(resourceLabel, gbc);
    for (int i = 0; i
< processCount ; i ++ ) { Label idLabel ; Label stateLabel ; Label resourceLabel ; // create the labels // add labels to the vectors // set constraints idLabel = new Label ( Integer . toString ( i ) , Label . RIGHT ) ; processIdLabelVector . insertElementAt ( idLabel , i ) ; gbc = new GridBagConstraints () ; gbc . gridx = 1 ; gbc . gridy = 3 + i ; gbl . setConstraints ( idLabel , gbc ) ; stateLabel = new Label ( ) ; stateLabel . setAlignment ( Label . RIGHT ) ; processStateLabelVector . insertElementAt ( stateLabel , i ) ; gbc = new GridBagConstraints () ; gbc . gridx = 2 ; gbc . gridy = 3 + i ; gbl . setConstraints ( stateLabel , gbc ) ; resourceLabel = new Label ( ) ; resourceLabel . setAlignment ( Label . RIGHT ) ; processResourceLabelVector . insertElementAt ( resourceLabel , i ) ; gbc = new GridBagConstraints () ; gbc . gridx = 3 ; gbc . gridy = 3 + i ; gbl . setConstraints ( resourceLabel , gbc ) ; } setLayout ( gbl ) ; add ( processesLabel ) ; add ( idLabel ) ; add ( stateLabel ) ; add ( resourceLabel ) ; for ( int i = 0 ; i < processCount ; i ++ ) { Label idLabel ; Label availableLabel ; idLabel = ( Label ) processIdLabelVector . elementAt ( i ); stateLabel = ( Label ) processStateLabelVector . elementAt ( i ) ; resourceLabel = ( Label ) processResourceLabelVector . elementAt ( i ) ; add ( idLabel ) ; add ( stateLabel ) ; add ( resourceLabel ) ; } } public void show () { for ( int i = 0 ; i < processCount ; i ++ ) { Process process = ( Process ) processes . elementAt ( i ) ; (( Label ) processStateLabelVector . elementAt ( i )). setText ( process . getState ()) ; Resource resource = process . getResourceAwaiting () ; String resourceText = "" ; if ( resource != null ) resourceText = Integer . toString ( resource . getId ()) ; (( Label ) processResourceLabelVector . elementAt ( i )). setText ( resourceText ); } super . 
show (); } } deadlock/Resource.java deadlock/Resource.java public class Resource { int id = 0 ; private int initialAvailable = 0 ; private int currentAvailable = 0 ; public Resource ( int newId ) { super () ; id = newId ; } public int getId () { return id ; } public void setInitialAvailable ( int i ) { initialAvailable = i ; } public int getInitialAvailable () { return initialAvailable ; } public void setCurrentAvailable ( int i ) { currentAvailable = i ; } public int getCurrentAvailable () { return currentAvailable ; } public void reset () { currentAvailable = initialAvailable ; } public void allocate () { currentAvailable -- ; } public void deallocate () { currentAvailable ++ ; } } deadlock/ResourcesDialog.java deadlock/ResourcesDialog.java import java . awt . BorderLayout ; import java . awt . Button ; import java . awt . Dialog ; import java . awt . Event ; import java . awt . Frame ; import java . awt . GridBagConstraints ; import java . awt . GridBagLayout ; import java . awt . Label ; import java . awt . Panel ; import java . awt . TextField ; import java . util . Vector ; public class ResourcesDialog extends Dialog { Button okButton ; Button cancelButton ; Label resourceLabel ; Label resourceCountLabel ; Panel topPanel ; Panel resourcePanel ; Panel bottomPanel ; int resourceCount ; Vector resources ; Vector initials = new Vector () ; Vector currents = new Vector () ; public ResourcesDialog ( Frame parent ) { super ( parent ) ; init () ; } public ResourcesDialog ( Frame parent , boolean modal ) { super ( parent , modal ) ; init () ; } public ResourcesDialog ( Frame parent , String title ) { super ( parent , title ) ; init () ; } public ResourcesDialog ( Frame parent , String title , boolean modal ) { super ( parent , title , modal ) ; init () ; } public void setResourceCount ( int newResourceCount ) { resourceCount = newResourceCount ; initials . setSize ( newResourceCount ); initials . trimToSize (); currents . setSize ( newResourceCount ); currents . 
trimToSize (); } public void setResources ( Vector newResources ) { resources = newResources ; } public void init () { topPanel = new Panel () ; resourcePanel = new Panel () ; bottomPanel = new Panel () ; resourceLabel = new Label ( "Number of resources:" , Label . RIGHT ) ; resourceCountLabel = new Label ( Integer . toString ( resourceCount ) , Label . LEFT ) ; topPanel . add ( resourceLabel ); topPanel . add ( resourceCountLabel ) ; okButton = new Button ( "ok" ) ; bottomPanel . add ( okButton ); cancelButton = new Button ( "cancel" ) ; bottomPanel . add ( cancelButton ) ; setLayout ( new BorderLayout () ) ; add ( "North" , topPanel ) ; add ( "Center" , resourcePanel ) ; add ( "South" , bottomPanel ); } public void show () { GridBagLayout gbl = new GridBagLayout () ; GridBagConstraints gbc = new GridBagConstraints (); resourceCountLabel . setText ( Integer . toString ( resourceCount ) ) ; // we need to add the correct number of lines to the panel resourcePanel . removeAll (); Label idLabel = new Label ( "Id" , Label . RIGHT ); resourcePanel . add ( idLabel ); gbc . gridx = 1 ; gbc . gridy = 1 ; gbc . fill = GridBagConstraints . NONE ; gbl . setConstraints ( idLabel , gbc ) ; Label availableLabel = new Label ( "Initial" , Label . RIGHT ) ; resourcePanel . add ( availableLabel ); gbc . gridx = 2 ; gbc . gridy = 1 ; gbl . setConstraints ( availableLabel , gbc ) ; availableLabel = new Label ( "Current" , Label . RIGHT ) ; resourcePanel . add ( availableLabel ); gbc . gridx = 3 ; gbc . gridy = 1 ; gbl . setConstraints ( availableLabel , gbc ) ; for ( int i = 0 ; i < resourceCount ; i ++ ) { Resource resource = ( Resource ) resources . elementAt ( i ) ; // add the vector of labels and fields here idLabel = new Label ( Integer . toString ( i ) , Label . RIGHT ) ; resourcePanel . add ( idLabel ); gbc . gridx = 1 ; gbc . gridy = 2 + i ; gbl . setConstraints ( idLabel , gbc ) ; TextField availableTextField = new TextField ( Integer . toString ( resource . 
getInitialAvailable ()) ) ; initials . insertElementAt ( availableTextField , i ); resourcePanel . add ( availableTextField ); gbc . gridx = 2 ; gbc . gridy = 2 + i ; gbl . setConstraints ( availableTextField , gbc ) ; availableTextField = new TextField ( Integer . toString ( resource . getCurrentAvailable ()) ) ; currents . insertElementAt ( availableTextField , i ); resourcePanel . add ( availableTextField ); gbc . gridx = 3 ; gbc . gridy = 2 + i ; gbl . setConstraints ( availableTextField , gbc ) ; if ( i == 0 ) { availableTextField . requestFocus () ; } } if ( resourceCount == 0 ) { cancelButton . requestFocus () ; } resourcePanel . setLayout ( gbl ); pack () ; super . show (); } public boolean action ( Event e , Object arg ) { if ( e . target == okButton ) { // before we set the resources, we need to validate the values ??? for ( int i = 0 ; i < resourceCount ; i ++ ) { Resource resource ; resource = ( Resource ) resources . elementAt ( i ) ; try { resource . setInitialAvailable ( Integer . parseInt ((( TextField ) initials . elementAt ( i )). getText ())); } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for resource initial available" ); (( TextField ) initials . elementAt ( i )). requestFocus (); return true ; } try { resource . setCurrentAvailable ( Integer . parseInt ((( TextField ) currents . elementAt ( i )). getText ())); } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for resource current available" ); (( TextField ) currents . elementAt ( i )). requestFocus (); return true ; } } this . hide () ; return true ; } else if ( e . id == Event . ACTION_EVENT && e . target instanceof TextField ) { try { int i = Integer . valueOf ((( TextField ) e . target ). getText (). trim ()). intValue () ; this . hide (); } catch ( NumberFormatException exception ) { System . out . println ( "invalid number value entered for resource available" ); (( TextField ) e . 
target ). requestFocus (); } return true ; } else if ( e . target == cancelButton ) { this . hide () ; return true ; } else return false ; } } __MACOSX/deadlock/._ResourcesDialog.java deadlock/ResourcesPanel.java deadlock/ResourcesPanel.java import java . awt . Dimension ; import java . awt . Panel ; import java . awt . GridBagConstraints ; import java . awt . GridBagLayout ; import java . awt . Label ; import java . util . Vector ; public class ResourcesPanel extends Panel { Label resourcesLabel ; Label idLabel ; Label availableLabel ; int resourceCount = 0 ; Vector resources ; Vector resourceIdLabelVector = new Vector (); Vector resourceAvailableLabelVector = new Vector () ; ResourcesPanel () { super () ; } ResourcesPanel ( int newResourceCount ) { super () ; resourceCount = newResourceCount ; } public void setResourceCount ( int newResourceCount ) { if ( newResourceCount >
resourceCount) {
    GridBagConstraints gbc;
    GridBagLayout gbl = (GridBagLayout) this.getLayout();
    Label idLabel;
    Label availableLabel;
    // add the objects to the vector
    // add the objects to the panel
    for (int i = resourceCount; i
< newResourceCount ; i ++ ) { idLabel = new Label ( ) ; availableLabel = new Label ( ); // add the objects to the vector resourceIdLabelVector . insertElementAt ( idLabel , i ) ; resourceAvailableLabelVector . insertElementAt ( availableLabel , i ) ; // add the constraints to the layout manager gbc = new GridBagConstraints () ; gbc . gridx = 1 ; gbc . gridy = 3 + i ; gbl . setConstraints ( idLabel , gbc ) ; gbc = new GridBagConstraints () ; gbc . gridx = 2 ; gbc . gridy = 3 + i ; gbl . setConstraints ( availableLabel , gbc ) ; // add the objects to the panel this . add ( idLabel ); this . add ( availableLabel ) ; } // redo the layout of the panel this . layout (); } else if ( newResourceCount < resourceCount ) { for ( int i = resourceCount - 1 ; i >=
newResourceCount; i--) {
        // remove the objects from the panel
        this.remove((Label) resourceIdLabelVector.elementAt(i));
        this.remove((Label) resourceAvailableLabelVector.elementAt(i));
        // remove the objects from the vector
        resourceIdLabelVector.removeElementAt(i);
        resourceAvailableLabelVector.removeElementAt(i);
    }
    // redo the layout of the panel
    this.layout();
}
resourceCount = newResourceCount;
}

public void setResources(Vector newResources) {
    resources = newResources;
}

public void setResourceId(int i, int newId) {
    Label label = (Label) resourceIdLabelVector.elementAt(i);
    Dimension oldSize = label.getSize();
    label.setText(Integer.toString(newId));
    Dimension newSize = label.getMinimumSize();
    if (newSize.width > oldSize.width)
        label.invalidate();
}

public void setResourceAvailable(int i, int newAvailable) {
    Label label = (Label) resourceAvailableLabelVector.elementAt(i);
    Dimension oldSize = label.getSize();
    label.setText(Integer.toString(newAvailable));
    Dimension newSize = label.getMinimumSize();
    if (newSize.width > oldSize.width)
        label.invalidate();
}

public void init() {
    GridBagConstraints gbc;
    GridBagLayout gbl = new GridBagLayout();
    resourcesLabel = new Label("Resources");
    gbc = new GridBagConstraints();
    gbc.gridx = 1;
    gbc.gridy = 1;
    gbc.gridwidth = GridBagConstraints.REMAINDER;
    gbl.setConstraints(resourcesLabel, gbc);
    idLabel = new Label("Id", Label.RIGHT);
    gbc = new GridBagConstraints();
    gbc.gridx = 1;
    gbc.gridy = 2;
    gbl.setConstraints(idLabel, gbc);
    availableLabel = new Label("Available", Label.RIGHT);
    gbc = new GridBagConstraints();
    gbc.gridx = 2;
    gbc.gridy = 2;
    gbl.setConstraints(availableLabel, gbc);
    for (int i = 0; i
< resourceCount; i++) {
        Label idLabel;
        Label availableLabel;
        // create the labels
        // add labels to the vectors
        // set constraints
        idLabel = new Label();
        idLabel.setAlignment(Label.RIGHT);
        resourceIdLabelVector.insertElementAt(idLabel, i);
        gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 3 + i;
        gbl.setConstraints(idLabel, gbc);
        availableLabel = new Label();
        availableLabel.setAlignment(Label.RIGHT);
        resourceAvailableLabelVector.insertElementAt(availableLabel, i);
        gbc = new GridBagConstraints();
        gbc.gridx = 2;
        gbc.gridy = 3 + i;
        gbl.setConstraints(availableLabel, gbc);
    }
    setLayout(gbl);
    add(resourcesLabel);
    add(idLabel);
    add(availableLabel);
    for (int i = 0; i < resourceCount; i++) {
        Label idLabel;
        Label availableLabel;
        idLabel = (Label) resourceIdLabelVector.elementAt(i);
        availableLabel = (Label) resourceAvailableLabelVector.elementAt(i);
        add(idLabel);
        add(availableLabel);
    }
}

public void show() {
    for (int i = 0; i < resourceCount; i++) {
        Resource resource = (Resource) resources.elementAt(i);
        ((Label) resourceIdLabelVector.elementAt(i)).setText(Integer.toString(resource.getId()));
        ((Label) resourceAvailableLabelVector.elementAt(i)).setText(Integer.toString(resource.getCurrentAvailable()));
    }
    super.show();
}
}

Data structures assignment/week2c

CI583: Data Structures and Operating Systems

The stack

1 / 28

More simple data structures

So far we have used two basic data structures: the array and the list. Both of these are general-purpose collection types, the main difference being that lists are more suitable when you don’t know in advance the exact size of a collection.

Arrays are more suitable when random access to elements is required, since this is O(1) for arrays and O(n) for lists.

2 / 28

More simple data structures

In this lecture we will examine some more specialised data structures, designed for particular tasks: the stack, the queue and the priority queue.

In each case, the underlying storage mechanism might be an array or a list – it doesn’t matter to us as users of, say, the stack. All that matters is that the stack provides the methods and capabilities we expect from a stack.

3 / 28

The stack

A stack is a collection type that allows us to push elements onto the front of it, pop (remove) elements from the front of it and, usually, to peek at the front element without removing it.

We cannot access anything other than the first element. If we want to get access to the third element, we need to call pop three times. This method of access is called LIFO – last in, first out.

4 / 28

The stack

Stacks are closely linked to low-level ways of interacting with computers and are used extensively in systems programming. Code that we write in a high-level language is compiled down to code that spends most of its time pushing and popping data from stacks.

There are even stack-based programming languages such as Forth.

5 / 28

The stack

Another important application for stacks is in writing parsers, which convert raw text input (such as a source file written using a programming language) into data structures with some particular meaning.

One of the steps involved in this task is often to push each lexical token (in our source file these might include if, else, variable names etc) onto a stack.

6 / 28

The stack

If we are implementing a stack using an array then the head of the stack isn’t necessarily the first element in the array (otherwise we would have to move elements every time we pushed or popped). Here is an empty stack backed by an array of n slots.

[Figure: an array with slots 0, 1, 2, ..., n-1, all empty.]

7 / 28

The stack

After calling stack.push(3):

[Figure: slot 0 holds 3; head points at slot 0.]

8 / 28

The stack

stack.push(8), stack.push(99):

[Figure: slots 0 to 2 hold 3, 8 and 99; head points at slot 2.]

9 / 28

The stack

At this point, pop() returns 99 and moves the position of the head.

[Figure: head now points at slot 1, which holds 8; 99 is still stored in slot 2 but is no longer part of the stack.]

10 / 28

The stack

Alternatively, we could implement the stack using a list. This is simpler, since we can make push and pop just operate on the head of the list so we don’t need to keep track of where the head is.

[Figure: a linked list holding 99, 8 and 3, with 99 at the head.]

11 / 28

Abstract data types

A stack of ints that uses an array to store the data:

class Stack {
    private int[] data;
    private int head;

    public Stack(int n) {
        data = new int[n];
        head = -1;
    }
    public void push(int e) {
        data[++head] = e; // increment head then use its value
    }
    public int pop() {
        return data[head--]; // use head’s value then decrement it
    }
    public int peek() {
        return data[head];
    }
}

12 / 28

Abstract data types

There are various things missing from the array-based implementation on the previous slide – what are they?

13 / 28

Abstract data types

Or using a list...

// LinkedList is assumed to be the list class from earlier lectures
// (providing cons, head and tail), not java.util.LinkedList.
class Stack {
    private LinkedList data;
    public Stack() {
        data = new LinkedList();
    }
    public void push(int e) {
        data.cons(e);
    }
    public int pop() {
        int h = data.head();
        data = data.tail();
        return h;
    }
    public int peek() {
        return data.head();
    }
}

14 / 28

Abstract data types

This brings us to the idea of abstract data types (ADTs). The stack is defined by the ability to push, pop and peek and there are many ways we might implement one.

An ADT is a template that defines data and behaviour at an abstract level. This can be achieved using Java generics.

15 / 28

Java generics

Java generics allow us to write code that works for many (or any) types.
It is what is happening when you see Java code that uses “angle brackets”, such as ArrayList<String> strs = new ArrayList<String>().

In the docs for the ArrayList class, you will see it described as ArrayList<E>. That means ArrayList is a container for objects of any type; the docs call that type E, and we will call it T.

When we create an ArrayList we have to say what type of thing we want to store in it, i.e. what is the type T.

16 / 28

Abstract data types

Using generics we can create a Stack that works for any type:

public class Stack<T> {
    //…
    public void push(T e) { … }
    public T pop() { … }
    public T peek() { … }
}

// …

Stack<Integer> myIntStack = new Stack<Integer>();
myIntStack.push(42); // OK
myIntStack.push("Hi!"); // compile-time error
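
One way the elided method bodies might be filled in is sketched below. It assumes an ArrayList as the backing store and throws an exception when an empty stack is popped or peeked; both choices are assumptions rather than part of the slide, and the class is called GenericStack only to keep it distinct from the Stack classes above.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

// Hypothetical name: the slide calls this class Stack.
class GenericStack<T> {
    // Assumption: an ArrayList backs the stack; the top is the last element.
    private final ArrayList<T> data = new ArrayList<>();

    // Add an element on top of the stack.
    public void push(T e) {
        data.add(e);
    }

    // Remove and return the top element.
    public T pop() {
        if (data.isEmpty()) {
            throw new NoSuchElementException("pop on an empty stack");
        }
        return data.remove(data.size() - 1);
    }

    // Return the top element without removing it.
    public T peek() {
        if (data.isEmpty()) {
            throw new NoSuchElementException("peek on an empty stack");
        }
        return data.get(data.size() - 1);
    }

    public boolean isEmpty() {
        return data.isEmpty();
    }
}
```

With this sketch, new GenericStack<Integer>() accepts push(42) but rejects push("Hi!") at compile time, just as described above.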

17 / 28

Balancing parens

As a demonstration of the usefulness of stacks, consider the task of
ensuring that all parentheses are nicely balanced in a piece of text.
So “{([()])}” is balanced but “{()” is not, because parens are not
all closed, and neither is “([(]))”, because the nesting is wrong.

We can model this problem with a stack. Every time we encounter
an opening paren character, push it onto a stack.

Every time we encounter a closing paren, pop the stack and check
that the types match.
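
The procedure just described can be sketched in Java as follows. This is a minimal illustration, not code from the module: the class and method names are made up, and java.util.ArrayDeque stands in for our own stack class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper class for illustration.
class ParenChecker {
    // Returns true if every (, [ and { in text is closed in the right order.
    static boolean isBalanced(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '(': case '[': case '{':
                    stack.push(c);              // opening paren: remember it
                    break;
                case ')': case ']': case '}':
                    if (stack.isEmpty()) {
                        return false;           // nothing left to close
                    }
                    char open = stack.pop();
                    if (!matches(open, c)) {
                        return false;           // wrong kind of paren
                    }
                    break;
                default:
                    break;                      // ignore all other characters
            }
        }
        return stack.isEmpty();                 // leftovers were never closed
    }

    private static boolean matches(char open, char close) {
        return (open == '(' && close == ')')
            || (open == '[' && close == ']')
            || (open == '{' && close == '}');
    }
}
```

The final isEmpty() check is what rejects “{()”, and the matches() test is what rejects “([(]))”.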

18 / 28

Balancing parens

Checking the balanced input {([])} one character at a time (top of the stack on the right):

push('{')       stack: {
push('(')       stack: { (
push('[')       stack: { ( [
pop() == '['    stack: { (
pop() == '('    stack: {
pop() == '{'    stack: empty

24 / 28

Balancing parens

An unbalanced example: ([(]))

push('(')       stack: (
push('[')       stack: ( [
push('(')       stack: ( [ (
pop() != '['    the closing ] is met while ( is on top of the stack, so the text is unbalanced

28 / 28


Data structures assignment/week3c

CI583: Data Structures and Operating Systems

Balanced Trees

1 / 36

Outline

1 Unbalanced trees

2 Red-black trees

3 Rotations

4 Inserting to a red-black tree

5 Deleting from a red-black tree

2 / 36

Unbalanced trees

Last time we saw how powerful and flexible a data structure the tree is. We saw that binary search trees can provide O(log n) retrieval, insertion and removal.

However, this is only true so long as the tree remains fairly well balanced. If we insert sequential data to a tree then the nodes arrange themselves just like a linked list. Performance degrades to linear time.
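
The effect is easy to reproduce. The sketch below uses a deliberately minimal, unbalanced BST (the class name SimpleBst and its methods are made up for illustration) and reports how many levels the tree has after a series of inserts.

```java
// Hypothetical minimal BST with no balancing at all.
class SimpleBst {
    private static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    // Standard BST insertion, written iteratively.
    void insert(int key) {
        if (root == null) {
            root = new Node(key);
            return;
        }
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return; }
                cur = cur.right;
            }
        }
    }

    // Number of nodes on the longest path from the root to a leaf.
    int height() {
        return height(root);
    }

    private static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Inserting the keys 1 to 1000 in order gives a tree 1000 levels deep, i.e. a linked list in disguise, whereas inserting 4, 2, 6, 1, 3, 5, 7 gives a tree of only 3 levels.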

3 / 36

Unbalanced trees

Say we have a tree made up of 10,000 nodes. If the tree is maximally unbalanced, then the worst-case scenario of searching for an item is that it takes 10,000 steps. If the tree is completely balanced (or complete, or full), the worst-case scenario is 14 steps.
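
The figure of 14 can be checked with a few lines of code. A binary tree with h full levels holds 2^h - 1 nodes, so the sketch below (the names are made up for illustration) finds the smallest number of levels that can hold n nodes.

```java
// Hypothetical helper for illustration.
class TreeHeights {
    // Fewest levels a binary tree with n nodes can have,
    // i.e. the smallest h such that 2^h - 1 >= n.
    static int minLevels(int n) {
        int levels = 0;
        long capacity = 0;              // nodes that fit in 'levels' full levels
        while (capacity < n) {
            levels++;
            capacity = 2 * capacity + 1;
        }
        return levels;
    }
}
```

For n = 10,000 this gives 14, because 13 full levels hold only 8,191 nodes while 14 levels hold up to 16,383.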

4 / 36

Unbalanced trees

Most of the time a tree will not be maximally unbalanced, but inputting a run of sequential data may leave it partially unbalanced, or it may have begun with a very small or large root. In a tree of natural numbers, for instance, if the root is labelled 3 there can be at most two nodes in the left-hand subtree.

Operations on a tree like this will be somewhere between O(n) and O(log n).

5 / 36

Self-balancing trees

Self-balancing trees are the solution to this problem. The first of these were AVL trees, invented by Adelson-Velskii and Landis in 1962.

The idea is that a self-balancing tree maintains an invariant, such as that no path from root to leaf is more than twice as long as any other. To achieve this, the tree must re-balance itself after insertion and deletion.

6 / 36

Self-balancing trees

Variations on this idea are used in file system design, relational databases, and whenever fast access to a large amount of data is required.

For instance, relational databases store indices in memory in a self-balancing tree structure called a B-tree or B+ tree, providing logarithmic access time with little or no IO.

Linux file systems such as ext3 store directory listings in an HTree, which uses a hash function to create a two-level balanced tree of files. HTree indexing improved the scalability from a practical limit of a few thousand files, into the range of tens of millions of files per directory.

7 / 36

Red-black trees

The type of self-balancing tree we will consider in detail is a BST called the red-black tree. Like the heap we saw in the last lecture, we define a series of invariants for RB-trees and make sure that they will all still hold after each operation.

Red-black trees were invented in 1972 by Bayer.

8 / 36

Red-black trees

The invariants on RB-trees:

1 Each node is either red or black (think of this “colour” as an extra bit – we could use 1 or 0 or any other choice).

2 The root and leaves are black.

3 If a node is red, its children must be black.

4 For each node, x, every path from x to a leaf contains the same number of black nodes.

The motivation for these conditions is probably mysterious to you, but we will see that maintaining them results in a balanced tree and gives us the logarithmic performance we want.
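
To make the invariants concrete, here is a sketch of a checker for properties 2 to 4. The Node layout (a boolean colour flag, null for the black leaves) and all names are assumptions made for this example; property 1 holds automatically because the colour is a boolean. The checker does not test the BST ordering of the keys.

```java
// Hypothetical checker for illustration.
class RbCheck {
    static final boolean RED = true, BLACK = false;

    static class Node {
        final int key;
        final boolean colour;
        final Node left, right;
        Node(int key, boolean colour, Node left, Node right) {
            this.key = key;
            this.colour = colour;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.colour == RED;    // null leaves count as black
    }

    // Number of black nodes on every path from n down to a null leaf,
    // counting n itself and the null leaf, or -1 if property 3 or 4 fails.
    static int blackCount(Node n) {
        if (n == null) {
            return 1;                           // a null leaf is one black node
        }
        if (n.colour == RED && (isRed(n.left) || isRed(n.right))) {
            return -1;                          // property 3: red node, red child
        }
        int l = blackCount(n.left);
        int r = blackCount(n.right);
        if (l == -1 || l != r) {
            return -1;                          // property 4: unequal black counts
        }
        return l + (n.colour == BLACK ? 1 : 0);
    }

    static boolean isValid(Node root) {
        return !isRed(root) && blackCount(root) != -1;  // property 2, then 3 and 4
    }
}
```

A black root with two red children passes; a red root, a red node with a red child, or a black node with a black child on one side only all fail.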

9 / 36

Red-black trees

An example. The small filled black circles represent null pointers in the leaves (not normally depicted). These are always black. We won’t normally show them but this is what we mean when we say the “leaves” are black.

[Figure: an example red-black tree with its null leaves drawn as small black circles.]

10 / 36

Red-black trees

The black-height of a node x is the number of black nodes on a
path from x to a leaf, not including x. So we can state property 4
in terms of black-height.


11 / 36

Red-black trees

Because we include the null pointers a red-black tree is a branching BST – every node has 2 or 0 children. The properties force a red-black tree with n nodes to have O(log n) height. Actually, the height will be at most 2 log(n + 1) – see (Cormen 2009, p309) for a proof.

Queries (e.g. search, find the minimum or maximum element etc) will require a visit to every level at worst, giving us O(h) or O(log n) time. Updates (insertion and deletion) are more tricky.

12 / 36

Rotations

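Before we can describe how to update a red-black tree, we need to understand rotations. A rotation is a local change to the structure of the tree that preserves the BST property.

[Figure: left-rotate(x) takes a subtree rooted at x, with left subtree α and right child y (whose subtrees are β and γ), and makes y the root, with left child x (subtrees α and β) and right subtree γ. right-rotate(y) reverses this.]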

13 / 36

Rotations

The left-rotation pivots around the link from x to y. When we rotate in either direction we assume that x and y are not nil (i.e. they are real, internal nodes). The α, β and γ components might be nil or might be actual subtrees. Either way, they are properly balanced RB trees.

14 / 36

Rotations

We can easily see that rotations preserve the BST property: the keys in α are less than the key of x, which is less than the key of y, and so on.
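
The two rotations can be sketched in a few lines. This version is a simplification for illustration: it has no parent pointers and instead returns the new root of the rotated subtree, so the caller is responsible for re-attaching it. The class and field names are assumptions.

```java
// Hypothetical sketch: rotations on nodes without parent pointers.
class Rotations {
    static class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Rotate left around the link from x to its right child y.
    // Assumes x and x.right are not null. Returns y, the new subtree root.
    static Node leftRotate(Node x) {
        Node y = x.right;
        x.right = y.left;   // β moves from y's left to x's right
        y.left = x;         // x becomes y's left child
        return y;
    }

    // Rotate right around the link from y to its left child x.
    // Assumes y and y.left are not null. Returns x, the new subtree root.
    static Node rightRotate(Node y) {
        Node x = y.left;
        y.left = x.right;   // β moves from x's right to y's left
        x.right = y;        // y becomes x's right child
        return x;
    }
}
```

Only two links change in each rotation, and applying leftRotate followed by rightRotate to the result restores the original shape.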


15 / 36

Rotations

An example within a BST.

[Figure: an example BST before left-rotate(T, x).]

16 / 36

Rotations

An example within a BST.

[Figure: the example BST with the nodes x and y marked.]

17 / 36

Rotations

Rotations take constant time since they only involve switching some pointers around. Recolouring is, of course, also done in constant time.

We will see that these two techniques are all we need to maintain the properties in a red-black tree.

18 / 36

Inserting to a red-black tree

We insert an element, x, to a red-black tree, T, as follows:

1 Insert x as if T were an ordinary BST – see last lecture. This step may break the RB properties.

2 Colour x red.

3 Restore the RB properties by recolouring and rotating.

After restoring the RB properties we know that the new tree, T′, is balanced (by the proof in Cormen mentioned before).
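
The three steps can be sketched as below. This follows the general shape of the insertion procedure in Cormen, but it is an illustrative sketch rather than the module's reference code: the class name, the use of null for the black leaves and the helper methods are all assumptions. The height() and isValid() methods are only there so that the result can be inspected.

```java
// Hypothetical red-black tree supporting insertion only.
class RedBlackTree {
    private static final boolean RED = true, BLACK = false;

    private static class Node {
        final int key;
        boolean colour = RED;               // step 2: a new node starts out red
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void insert(int key) {
        Node z = new Node(key);             // step 1: ordinary BST insertion
        Node parent = null;
        for (Node cur = root; cur != null; ) {
            parent = cur;
            cur = key < cur.key ? cur.left : cur.right;
        }
        z.parent = parent;
        if (parent == null) root = z;
        else if (key < parent.key) parent.left = z;
        else parent.right = z;
        fixup(z);                           // step 3: restore the RB properties
    }

    private void fixup(Node z) {
        while (z.parent != null && z.parent.colour == RED) {
            Node p = z.parent;
            Node g = p.parent;              // exists: p is red, so p is not the root
            boolean pIsLeft = (p == g.left);
            Node uncle = pIsLeft ? g.right : g.left;
            if (uncle != null && uncle.colour == RED) {
                p.colour = BLACK;           // recolour, move the violation up
                uncle.colour = BLACK;
                g.colour = RED;
                z = g;
            } else if (pIsLeft) {
                if (z == p.right) { z = p; rotateLeft(z); }  // straighten the dog-leg
                z.parent.colour = BLACK;    // then recolour and rotate the other way
                g.colour = RED;
                rotateRight(g);
            } else {                        // mirror image of the branch above
                if (z == p.left) { z = p; rotateRight(z); }
                z.parent.colour = BLACK;
                g.colour = RED;
                rotateLeft(g);
            }
        }
        root.colour = BLACK;
    }

    private void rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        if (y.left != null) y.left.parent = x;
        replaceChild(x, y);
        y.left = x;
        x.parent = y;
    }

    private void rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        if (x.right != null) x.right.parent = y;
        replaceChild(y, x);
        x.right = y;
        y.parent = x;
    }

    // Put 'replacement' where 'old' hangs under its parent (or at the root).
    private void replaceChild(Node old, Node replacement) {
        replacement.parent = old.parent;
        if (old.parent == null) root = replacement;
        else if (old == old.parent.left) old.parent.left = replacement;
        else old.parent.right = replacement;
    }

    // Number of nodes on the longest root-to-leaf path.
    int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Checks that the root is black and that properties 3 and 4 hold.
    boolean isValid() {
        return (root == null || root.colour == BLACK) && blackCount(root) != -1;
    }

    private static int blackCount(Node n) {
        if (n == null) return 1;
        boolean redChild = (n.left != null && n.left.colour == RED)
                        || (n.right != null && n.right.colour == RED);
        if (n.colour == RED && redChild) return -1;
        int l = blackCount(n.left), r = blackCount(n.right);
        if (l == -1 || l != r) return -1;
        return l + (n.colour == BLACK ? 1 : 0);
    }
}
```

Inserting the keys 1 to 1000 in order, which would produce a 1000-level chain in a plain BST, leaves this tree valid and no more than 2 log(n + 1), roughly 20, levels deep.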

19 / 36

Inserting to a red-black tree

Demo

20 / 36

Case 1: Recolouring

We can recolour a node whenever doing so does not change the black-heights of the tree. This occurs when the parent and uncle (other child of the grandparent) of the node are both red.

[Figure: case 1, recolouring.]

Recolouring moves the problem up the tree. A is shown with only one child because it doesn’t matter if B is the right or left child.

21 / 36

Case 2: left rotation

If we can’t recolour any more, we use rotations. The first case is where z, the violating node, is the right child of its parent. We use a left rotation to achieve the situation where z is the left child.

[Figure: case 2, left rotation.]

22 / 36

Case 3: right rotation and recolouring

Case 2 is followed immediately by case 3, in which we use a right rotation and recolouring.

[Figure: case 3, right rotation and recolouring.]

Note that case 2 falls through into case 3, but case 3 is a case of its own – i.e. if case 2 is not applicable we may still be able to apply case 3.

23 / 36

Case 3: right rotation and recolouring

Note that recolouring C is not a problem (will not produce two reds in a row) because we know that the root of the subtree δ is black – otherwise we would be in case 1. When there are no longer two red nodes in a row, the algorithm terminates.

24 / 36

Red-black insertion
A complete example

Preparing to insert a value to a red-black tree.

[Figure: the tree before inserting 15.]

25 / 36

Red-black insertion
A complete example

After inserting the new node and colouring it red, we have broken property 3. Looking at the grandparent of the new node, we have a candidate for recolouring.

26 / 36

Red-black insertion
A complete example

Now the violation has moved further up the tree and we can’t do any more recolouring. The violating node is the left child of its parent, so use right rotation.

27 / 36

Red-black insertion
A complete example

We have straightened out the dog-leg. Now the violating node is the right child of its parent. Rotate left.

28 / 36

Red-black insertion
A complete example

Recolour the root and we are done.


29 / 36

Red-black insertion

The pseudocode for insertion to a red-black tree is quite easy to follow but a bit too long to go on a slide, simply because there are a lot of cases to consider. Again, see Cormen for an example.


Red-black deletion

Similarly to insertion, we delete from a red-black tree just as we
would from a BST, then call a “fixup” routine to repair the RB
properties that might have been broken in the previous step. Again,
properties are fixed by recolouring and rotation.

Deleting a red node cannot violate the RB properties so we only
call the “fixup” routine when the node we removed was black.


Let y be a black node removed from a red-black tree and x be the
node that takes its place. What might have been broken by the

removal of y?

1 If y was the root of the tree and x is red, we have violated
property 2.

2 If x and x.p are both red, we have violated property 3.

3 The removal of y means that there is one fewer black node in
any path through x, so we have definitely violated property 4.

To get around the last problem we start by saying that x is either
doubly black (if it was black to start with) or red-black. In this way,

x contributes 2 or 1 to the black-height of any path passing
through it, rather than 1 or 0.


We don’t actually change the colour value of x (the node that took
y’s place). We keep track of it just by the fact that x is pointing to
it. The goal of the deletion fixup algorithm is to move this extra
blackness up the tree until:

1 x points to a red-black node, in which case we colour it black.

2 x points to the root, in which case we are done.

3 Suitable recolouring and rotations have fixed the properties.


Thus, whilst x is a non-root doubly black node, we have changes
that need to be made. The cases are as follows:

1 Case 1: x’s sibling, w, is red. Rotate and recolour.

2 Case 2: w is black and both of w’s children are black.
Recolour and move the problem further up the tree.

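3 Case 3: w is black, w.left is red and w.right is black.
Recolour and rotate.

4 Case 4: w is black and w.right is red. Recolour and rotate.

Note that one case can lead into another: case 1 leads into case 2,
3 or 4, and case 3 leads into case 4. The cases are stated for x as a
left child; swap left and right when x is a right child.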


Inserting to a red-black tree

Demo

Note to self: try [10, 34, 48, 79, 83], delete 10.


Next week

An overview of some algorithmic strategies: divide-and-conquer,

greedy algorithms, backtracking, dynamic programming.


Unbalanced trees
Red-black trees
Rotations
Inserting to a red-black tree
Deleting from a red-black tree


Data structures assignment/week3b

CI583: Data Structures and Operating Systems

Binary Trees


Binary search trees

Trees start to get interesting when we place some constraints on
their structure. One constraint is that their labels are ordered. If we
do this with a binary tree we get a binary search tree (BST),
defined as follows:

1 for each node, n, every key in the left sub-tree of n is less than
the key of n and every key in the right sub-tree of n is greater than
the key of n,

2 keys are unique, and

3 the left and right children of n are binary search trees.


To find a key, k, we start at the root, r. If the key of r is equal
to k, we are done. If the key of r is less than k, we take the right
sub-tree; if it is greater than k, we take the left. If the sub-tree
we need does not exist, then k was not found. Otherwise, we keep
following branches in the same way until we either reach a node whose
key is equal to k or run out of tree, in which case k was not found.


Inserting to a BST

Inserting a new key, k, works similarly. We find the right place to
put k by following branches until the sub-tree to follow does not
exist, then we attach a new node with k as the label.
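
The search and insertion procedures can be sketched imperatively in Java. This is only an illustration under assumptions: the class BstSketch, its Node type and the use of int keys are invented for the example, and duplicate keys are rejected because keys are unique.

```java
// Hypothetical BST with int keys, for illustration only.
class BstSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Follow branches from the root until k is found or the needed sub-tree is missing.
    boolean contains(int k) {
        Node n = root;
        while (n != null) {
            if (k == n.key) return true;
            n = (n.key < k) ? n.right : n.left;
        }
        return false;
    }

    // Attach a new node where the search for k falls off the tree.
    // Returns false if k is already present.
    boolean insert(int k) {
        if (root == null) { root = new Node(k); return true; }
        Node n = root;
        while (true) {
            if (k == n.key) return false;
            if (n.key < k) {
                if (n.right == null) { n.right = new Node(k); return true; }
                n = n.right;
            } else {
                if (n.left == null) { n.left = new Node(k); return true; }
                n = n.left;
            }
        }
    }

    public static void main(String[] args) {
        BstSketch t = new BstSketch();
        for (int k : new int[] {5, 3, 9, 6, 7}) t.insert(k);
        System.out.println(t.contains(7)); // prints "true"
    }
}
```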

(Figures: inserting 7 into the example tree by comparing 7 > 5,
then 7 > 6, then 7 < 9.)

Binary search trees

Deleting a node is more tricky. Deleting a node with one or zero
branches is easy enough, but to delete a node with two branches we
need to merge the branches to produce a new node. (Details in a lab
session coming soon!)

Balanced trees

If we insert random data into our trees, they will remain fairly well
balanced. That is, the tree is as full as possible, or has the minimum
number of missing branches. Then, each pair of left and right
sub-trees will contain (approximately) the same number of nodes and
the distance from the root to any leaf will be similar.

If the input is not random, e.g. is in descending order, the tree will
become unbalanced.

In this case the tree has poor performance characteristics: search,
insertion and deletion are all O(n). The family of balanced trees are
those where all operations on the tree maintain its balanced
structure, requiring quite a lot of rearranging of nodes etc.

If we can maintain the balance, search trees can be extremely
efficient. If the tree is full then about half of all nodes are leaf
nodes. On average, half of all searches will result in the need to
traverse the tree all the way to a leaf.

In searching, we need to visit one node at each level. So, we can see
how many steps a search will take by working out how many levels
there are.

Numbers of nodes and levels in a balanced tree:

Nodes            Levels
15               4
1,023            10
32,767           15
1,048,575        20
33,554,431       25
1,073,741,823    30

Thus, we can find one of a million unique elements in about 20 steps
(sound familiar?). A balanced tree with n nodes has lg(n + 1) levels.

An imperative implementation

Implementing trees functionally gives a representation which seems
“natural”, but we can also implement them imperatively. We can store
the labels in an array, without managing links between them.
Every possible node in the tree is represented by a position in the
array, whether or not the node exists. The array positions that map
to non-existent nodes contain null, or some special value.

Using this scheme, we can find the child and parent nodes of an
index, i, using arithmetic:

1 The left child of i is located at 2i + 1.

2 The right child of i is located at 2i + 2.

3 The parent of i is located at ⌊(i − 1)/2⌋.

Check for yourself that the formulae above work.

(Figure: a tree and the array that stores it.)

Heaps

Implementing a tree as an array wastes space and deletion requires
us to move every element, so it isn't normally the best choice. It
does lend itself to one important application though: heaps.

A heap is a binary tree with the following characteristics:

1 It is complete: every level except the last one is full and the
last row has no gaps reading from left to right.

2 Each node satisfies the heap condition: its key is greater than or
equal to the keys of its children.

Note that the invariants of the heap are weaker than those of the
BST, but just strong enough to guarantee efficient insertion and
removal.

Any path from the root down through a heap gives an ordered list
(descending in our case, but we could have arranged it the other way
round). Because a heap is complete, no space is wasted in the array.

Heaps as priority queues

We can use the heap to model a priority queue, where the root is the
front of the queue (or has the highest priority). When we remove the
element at the front of the queue we need to restore the heap, making
sure it is complete and satisfies the heap condition:

1 Remove the root node.

2 Move the last node to the root. The last node is the rightmost
node on the lowest level.

3 Trickle down the new root until it's below a node larger than it
and above a node less than it, if one exists.
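
The index arithmetic and the three removal steps can be sketched in Java as below. The class HeapSketch, its method names and the choice of an ArrayList of Integer are assumptions made for this illustration. An insert method (append, then swap upwards with the parent) is included only so that the example can build a heap to remove from.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical array-backed max-heap, for illustration only.
class HeapSketch {
    private final List<Integer> a = new ArrayList<>();

    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
    // Integer division truncates, which equals the floor for every i >= 1.
    static int parent(int i) { return (i - 1) / 2; }

    int size() { return a.size(); }

    // Append in the first free position, then swap upwards with the parent.
    void insert(int v) {
        a.add(v);
        int i = a.size() - 1;
        while (i > 0 && a.get(parent(i)) < a.get(i)) {
            swap(i, parent(i));
            i = parent(i);
        }
    }

    // Remove the root, move the last node to the root, then trickle it down.
    // Assumes the heap is not empty.
    int remove() {
        int top = a.get(0);
        int last = a.remove(a.size() - 1);
        if (!a.isEmpty()) {
            a.set(0, last);
            int i = 0;
            while (true) {
                int l = left(i), r = right(i), big = i;
                if (l < a.size() && a.get(l) > a.get(big)) big = l;
                if (r < a.size() && a.get(r) > a.get(big)) big = r;
                if (big == i) break;      // heap condition restored
                swap(i, big);             // swap with the largest child
                i = big;
            }
        }
        return top;
    }

    private void swap(int i, int j) {
        int t = a.get(i);
        a.set(i, a.get(j));
        a.set(j, t);
    }

    public static void main(String[] args) {
        HeapSketch h = new HeapSketch();
        for (int v : new int[] {3, 9, 5, 7, 1}) h.insert(v);
        while (h.size() > 0) System.out.print(h.remove() + " "); // prints "9 7 5 3 1 "
    }
}
```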
Deleting from a heap

When trickling down, at each node we swap places with the largest
child.

(Figures: heap diagrams labelled “remove last node”, then “swap”
three times.)

Inserting to a heap

Inserting a new value to a heap is even easier: we put the new value
in the first free position (starting a new level if necessary) and
trickle up, swapping places with the parent, until the node is no
larger than its parent.

Our heap has lg(n + 1) levels, where n is the number of nodes.
Insertion and deletion require visiting one node on every level (at
worst), so both operations are O(log n).

Heapsort

We can use heaps as the basis of an elegant and efficient sorting
algorithm called heapsort. The idea is that we insert the unsorted
values into a heap, then repeated applications of remove will give us
a sorted collection.

for(int i=0;i

In this example:

a is the prefix for the names of the files containing the commands, (the actual names of the
files are “a0.dat”, and “a1.dat”),

2 is the number of processes to be created,

1 is the number of instances to create for the first resource, and

a.log is the name of the output file.

The program will display a window allowing you to run the simulator. You will notice a row
of command buttons across the top, and an informational display below. The left side of the
information display lists the resources and the number of available instances for each, and the
right side lists the processes and the current status for each.

Typically, you will use the step button to execute a cycle of the simulation and observe the
effect on the resources and processes. When you’re done, quit the simulation using the exit
button.

The Command Line

The general form of the command line is

$ java deadlock file-name-prefix initial-number-of-processes initial-available-for-resource …

where

Parameter Description

file-name-prefix
    Specifies the name prefix for the process command files. The default
    command file names are generated from this prefix, followed by the number
    of the process, followed by “.dat” (e.g. “a0.dat”, “a1.dat” if “a” is the prefix).
    The actual names of the files may be entered or modified in the Processes
    Dialog (see below).

initial-number-of-processes
    Specifies the number of processes to create for the simulation. This should
    be a non-negative number, usually greater than one. This number may also
    be entered or modified using the Options Dialog (see below).

initial-available-for-resource…
    Specifies the initial number of instances available for each resource. This
    should be a sequence of non-negative numbers. For example, “2 1 2 0”
    indicates that there are four resources, and there are initially two instances of
    resource 0, one instance of resource 1, two instances of resource 2, and zero
    instances of resource 3. The number of resources may also be entered or
    modified using the Options Dialog (see below). The initial number of
    instances available for each resource may be entered or modified using the
    Resources Dialog (see below).

The Control Panel

The main control panel for the simulator includes a row of command buttons, and an
informational display.

The buttons:

Button Description

run
    Runs the simulation to completion. Note that the simulation pauses and
    updates the screen between each step.

stop
    Stops the simulation if it is running. This button is only active if the
    run button has been pressed.

step
    Runs a single step of the simulation and updates the display.

reset
    Initializes the simulator and starts from the initial values for each
    process and resource.

options
    Allows you to change various options for the simulator, including the
    number of resources and the number of processes.

resources
    Allows you to change the configuration for each resource, including the
    initial and current number of instances available for each resource.

processes
    Allows you to change the configuration for each process, including current
    state and the name of the command file for that process.

exit
    Exits the simulation.

The informational display:

Field Description

Time:
    Number of “milliseconds” since the start of the simulation.

Resource Id:
    A number which identifies the particular resource. Resources are numbered
    starting with zero.

Resource Available:
    The number of instances available for the particular resource. This is a
    non-negative number.

Process Id:
    A number which identifies the particular process. Processes are numbered
    starting with zero.

Process State:
    The current state of the process. This may be U (unknown), C (computable),
    W (waiting), or H (halted). At the beginning of the simulation, all processes
    have U status. While a process is computable, it has a C status. If it requests a
    resource which is unavailable, it enters W status until the resource becomes
    available. When a process has completed all the commands in its command
    file or performs a halt command, it enters H status.

Process Resource:
    The resource for which this process is waiting, if any. This field only has a
    value if the process is in W status.

The Options Dialog Box

The Options Dialog Box allows you to set general options for the simulator.

The options:

Field Description

Number of Processes:
    The number of processes to use in the simulation. This should be a
    non-negative number, usually at least two. Although the program does not
    enforce a limit, you may not be able to view more than about 10 processes
    on the informational display on your display screen. The initial value for
    this option is obtained from the second parameter on the command line, or
    zero, if not specified. Keep in mind that each process should have a
    process command file. To set properties for individual processes, use the
    Processes Dialog (see below).

Number of Resources:
    The number of resources available in the simulation. This should be a
    non-negative number, usually at least one. Although the program does not
    enforce a limit, you may not be able to view more than about 10 resources
    on the informational display on your display screen. The initial value for
    this option is obtained from the number of initial instances for each
    resource specified on the command line (see above), or zero, if none are
    specified. This number should be one more than the largest resource
    number mentioned in any of the process command files for the simulation.
    To set properties for individual resources, use the Resources Dialog (see
    below).

Milliseconds per step:
    The number of real-time milliseconds to pause between each cycle of the
    simulator in “run” mode. This is the pause between cycles when you hit
    the run button. The default value is 1000 milliseconds, or one second.

The Processes Dialog Box

The Processes Dialog Box allows you to enter or modify properties for each process.

The process properties:

Field Description

Number of Processes:
    The number of processes in the simulation. To change this value, use the
    Options Dialog (see above).

Process Id
    The id number for the process. These numbers are used to identify each
    process and are assigned by the simulator, starting with zero. These numbers
    cannot be changed.

Process File Name
    The name of the file from which process commands are read. This may be
    any valid filename. For convenience, there is a choose button which allows
    you to browse the file system to choose the file. By default, the name is the
    prefix string, followed by the process number, followed by “.dat”.

The Resources Dialog Box

The Resources Dialog Box allows you to enter and modify properties for each resource.

The resource properties:

Field Description

Number of resources:
    The number of resources available in the simulation. To change this value,
    use the Options Dialog (see above).

Resource Id
    The id number assigned to the resource. This number is used to identify the
    resource and is assigned by the simulator and cannot be changed. This is the
    number which appears in the R (request resource) and F (free resource)
    commands in the process command files.

Resource Initial
    The initial number of available instances of the resource. This number is
    used when the simulator starts or is reset.

Resource Current
    The current number of available instances of the resource. This number may
    be changed during the simulation to see the effect it may have on processes
    waiting for the resource.

The Process Command Files
The process command files for the simulator specify a sequence of operations to be performed
by the process or processes which use the file. There are four operations defined: C
(compute), R (request resource), F (free resource) and H (halt).

Operation Description

C msec
    Compute for the specified number of milliseconds (cycles).

R resource-id
    Request an instance of the specified resource. If none are available, block the
    process until the resource becomes available. The resource id should be a
    non-negative number less than the total number of resources available.

F resource-id
    Free an instance of the specified resource. This is usually a resource that was
    previously requested by the process. The resource id should be a non-negative
    number less than the total number of resources available.

H
    Halt the process. This is usually the last operation in the file. Any commands
    which follow it in the file are ignored. Any file that does not end with this
    operation is implicitly halted.
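
To make the command format concrete, here is a minimal Java sketch of how a single line of a process command file might be parsed. It is not the simulator's own code: the class CommandSketch and its parse method are hypothetical, and the sketch assumes that “//” starts a comment running to the end of the line and that block comments (“/* … */”) have already been stripped by the caller.

```java
// Hypothetical parser for one command line; not taken from the simulator.
class CommandSketch {
    final char op;   // 'C', 'R', 'F' or 'H'
    final int arg;   // milliseconds for C, resource id for R and F, -1 for H

    CommandSketch(char op, int arg) {
        this.op = op;
        this.arg = arg;
    }

    // Returns null for a blank or comment-only line.
    static CommandSketch parse(String line) {
        int comment = line.indexOf("//");
        String s = (comment >= 0 ? line.substring(0, comment) : line).trim();
        if (s.isEmpty()) return null;
        String[] parts = s.split("\\s+");
        if (parts[0].length() != 1) {
            throw new IllegalArgumentException("unknown operation: " + s);
        }
        char op = parts[0].charAt(0);
        switch (op) {
            case 'C':
            case 'R':
            case 'F':
                if (parts.length < 2) {
                    throw new IllegalArgumentException("missing argument: " + s);
                }
                return new CommandSketch(op, Integer.parseInt(parts[1]));
            case 'H':
                return new CommandSketch(op, -1);
            default:
                throw new IllegalArgumentException("unknown operation: " + s);
        }
    }

    public static void main(String[] args) {
        CommandSketch c = parse("R 0 // request resource 0");
        System.out.println(c.op + " " + c.arg); // prints "R 0"
    }
}
```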

Sample Process Command Files

The “a0.dat” input file looks like this:

/*
a0.dat

The “a” collection of process data files is meant to simulate
two processes competing for a single resource. If you run
the simulator with one resource available, one of the processes
will block until the other is done using the resource.
*/
C 10 // compute for 10 milliseconds
R 0 // request resource 0
C 10 // compute for 10 milliseconds
F 0 // free resource 0
H // halt process

Note that the “a1.dat” file is identical. In other words, both files request the same resources at
approximately the same time.

The Output File
The output file contains a log of the simulation since the simulation started.

The output file contains one line per cycle executed. The format of each line is:

time = t available = r0 r1 … rn blocked = n

where
t is the number of milliseconds since the start of the simulation,
ri is the number of available instances of each resource, and
n is the number of blocked processes.

Sample Output

The output file “a.log” looks something like this:

time = 0 available = 1 blocked = 0
time = 1 available = 1 blocked = 0
time = 2 available = 1 blocked = 0
time = 3 available = 1 blocked = 0
time = 4 available = 1 blocked = 0
time = 5 available = 1 blocked = 0
time = 6 available = 1 blocked = 0
time = 7 available = 1 blocked = 0
time = 8 available = 1 blocked = 0
time = 9 available = 1 blocked = 0
time = 10 available = 0 blocked = 1
time = 11 available = 0 blocked = 1
time = 12 available = 0 blocked = 1
time = 13 available = 0 blocked = 1
time = 14 available = 0 blocked = 1
time = 15 available = 0 blocked = 1
time = 16 available = 0 blocked = 1
time = 17 available = 0 blocked = 1

time = 18 available = 0 blocked = 1
time = 19 available = 0 blocked = 1
time = 20 available = 0 blocked = 0
time = 21 available = 0 blocked = 0
time = 22 available = 0 blocked = 0
time = 23 available = 0 blocked = 0
time = 24 available = 0 blocked = 0
time = 25 available = 0 blocked = 0
time = 26 available = 0 blocked = 0
time = 27 available = 0 blocked = 0
time = 28 available = 0 blocked = 0
time = 29 available = 0 blocked = 0
time = 30 available = 1 blocked = 0

In this example, the simulation runs for a total of 30 “milliseconds” and then halts. During the
first 10 milliseconds, all processes are computable. During the next 10
milliseconds, the one instance of the resource is allocated to one process, while the other
process is blocked. During the final 10 milliseconds, the first process frees the resource, but it
is immediately allocated to the second process, which then continues to compute, unblocked,
to the end of the simulation.

Suggested Exercises
1. Try running the deadlock simulator using the following command:

java deadlock a 2 2

Explain why a deadlock does not occur.

2. There are two additional process command files (“b0.dat” and “b1.dat”) in the
distribution. Run the deadlock simulator with this command:

java deadlock b 2 1 1

What happens?

Now try this.

java deadlock b 2 1 2

Why does the first command result in a deadlock but the second does not? Explain your
answer in terms of what is going on in the process command files, b0.dat and b1.dat.


Data structures assignment/week11d

Distributed systems

CI583: Data Structures and Operating Systems
Distributed systems


Distributed operating systems

The topic of distributed systems is a broad one, and includes
everything from the internet, to a small LAN, to render farms in
which many (otherwise independent) computers collaborate to
complete a computationally intensive task.


From an OS point of view, we are interested in a particular type of
distributed system – one in which the elements, or nodes, share
one or more of the core OS responsibilities, such as process
management.

This can include clusters and cloud computing, but also more
tightly coupled distributed systems.

Truly distributed OS

A truly distributed OS is one in which several physically separate
machines each provide part of the functionality of a single OS and,
taken together, the collection of machines provides the image of a
single unit.

The physical location of each OS component (external storage, for
example) is transparent to the user. Many nodes may be assigned
to a particular function (e.g. the render farm example, though
most render farms work at the application level and are not based
on a distributed OS).


The motivation for this architecture includes:

1 Load balancing.

2 Reliability, redundancy and fault tolerance. Several nodes can
work on the same task, and the result of the first one to complete
is used.

3 Availability.

The danger is that nodes might spend most of their time
communicating with other nodes…


Within such a system, each machine runs a microkernel that
manages that node’s hardware and handles communication with
other nodes.

Each node also contains one or more system management
components, that provide that node’s functionality.


A key distinction between such systems is how the distribution is
managed:

1 centralised: all activity is managed by a single master node,

2 decentralised: a tree-like structure where nodes are branches,
and manage the activity of nodes beneath them, or leaves,

3 distributed: each node has one or more links to other nodes
and there is no master. There may be no way for any node to
“see” the entire system.

Distributed architecture: centralised

(Figure: a Master and four other Nodes.)

Distributed architecture: decentralised

(Figure: a Master and four other Nodes.)

Distributed architecture: clustered


These tightly-coupled systems were the focus of research for
decades from the 1970s onward, but no system ever solved the
problem of efficient communication between nodes.

The OS usually communicates with IO devices like network cards
via a high speed bus, for example. Doing the same thing over a
LAN is far slower.

Since the early 1990s, attention has turned to more loosely-coupled
systems such as clusters – collections of independent machines
each with their own OS.


Clusters

A cluster is a collection of independent machines, each with their
own traditional OS installed, but which collaborate on some task in
such a tightly-connected way that they can be viewed from the
outside world as a single system.

Clusters of quite cheap machines have been assembled that rival
the world’s fastest and most expensive individual computers.

Using commodity machines means that when a node fails, it is just
thrown out and replaced (though it might just be left in place for a
while and collected with other dead nodes in one sweep).


A cluster provides large-scale parallelism (small-scale being taking
advantage of multiple cores), so that jobs can be broken up into
tasks that are worked on simultaneously.

Most cluster architectures still rely on a master node whose failure
will wipe out the system.


A Google server room containing a single cluster called the
Compute Engine – world’s third fastest supercomputer. You can
hire it for $2m per day. The system it runs is based on GFS and
the MapReduce architecture.

http://research.google.com/archive/mapreduce.html


The Compute Engine has 96,250 physical machines, with access to
770,000 cores. Each core has access to 3.75GB of RAM.

Figures from http://www.extremetech.com/.


Hadoop is a FOSS system based on a reverse-engineering of
MapReduce. It is used by Facebook, Yahoo!, Amazon and others.

Three problems to solve:

1 Message passing.

2 Task scheduling.

3 Node failure.

Message passing

Obviously, the performance of the bus or network that connects
nodes in the cluster is critical. Many clusters use MPI (the Message
Passing Interface), a message-passing standard whose implementations
commonly run on top of TCP/IP. It has extended strategies for
dealing with messages that never arrive, etc.

Especially in large clusters, it is important to minimise the distance
of each node from the master. Hadoop is “rack-aware”, so it
knows the physical location of each node.

Task scheduling

Deciding which tasks should be given to which nodes is an open
problem.

MapReduce has a process on the master node called the
JobTracker, which breaks work down into individual tasks.


The JobTracker passes these tasks to nodes, trying to minimise
network traffic by choosing nodes on the same rack and as close as
possible to the source of the job. Once processing starts there is
no preemption.

MapReduce uses a strategy of high redundancy: several nodes
might be working on the same item of work and so each item is
guaranteed to be done at least once, rather than exactly once.
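
The idea of giving the same item of work to several workers and taking whichever result arrives first can be imitated on a single machine. The Java sketch below is only an analogy and says nothing about how MapReduce or Hadoop is implemented: threads stand in for nodes, and the class RedundancySketch and its method runRedundantly are hypothetical names. It relies on the standard ExecutorService.invokeAny method, which returns the result of one task that completed successfully and cancels the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical single-machine analogy: threads play the role of cluster nodes.
class RedundancySketch {
    // Submit `replicas` copies of the same task and return one successful result.
    static <T> T runRedundantly(Callable<T> task, int replicas) {
        ExecutorService pool = Executors.newFixedThreadPool(replicas);
        try {
            List<Callable<T>> copies = new ArrayList<>();
            for (int i = 0; i < replicas; i++) copies.add(task);
            return pool.invokeAny(copies);   // copies still running are cancelled
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("no replica completed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Integer answer = runRedundantly(() -> 6 * 7, 3);
        System.out.println(answer); // prints "42"
    }
}
```

Because every copy may run to completion before being cancelled, the work must be safe to perform more than once, which is what “at least once” implies.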

Node failure

When a node fails or appears to be malfunctioning, its task is
re-executed (unless a high-redundancy strategy is already in place,
meaning that another node is already working on the task).

That node then needs to be isolated from its neighbours – this can
be done by powering off the node or simply making sure that it has
no access to shared resources, such as external storage that other
nodes also make use of.

This is called resource fencing.


Like virtualisation, clustered computing is an idea that has really
taken off.

Rather than running their own server room, a small-to-medium
sized business would now be much better off outsourcing this
problem to a provider such as Google or Amazon.


The company’s computing needs are then served by a cluster
sitting in the cloud.

The nodes in the cluster are probably running VMs.

This takes care of load-balancing, reliability, backup, expansion and
contraction of the network, load on individual nodes and
bandwidth usage.


Data structures assignment/week10b

64-bit systems

CI583: Data Structures and Operating Systems
Memory management part 2


Page tables

The system we have described so far is a complete page table.

In this system, every page in the addressable memory has an entry
in the page table, even though most of these entries will have the
validity bit set to zero.

Rather than simply holding an entire page table that maps to all
addressable memory, different schemes are used to store page
tables more efficiently.


Such schemes are needed because complete page tables take up too
much space. They also have other benefits: for instance, by using an
indirect mapping we can increase the addressable memory.

In various ways, these schemes aim to hold in memory only those
parts of the page table that map to the parts of primary storage
currently being used.


Forward-mapped page tables

Forward-mapped (or multilevel) page tables break a virtual
memory location down into three parts:

the Level 1 page number (L1),

Level 2 page number (L2), and

the offset.

(Figure: a virtual memory location divided into an L1 page number,
an L2 page number and an offset. The L1 page table selects an L2
page table, which selects a page frame. Image adapted from
(Doeppner, 2011).)

As before, entries in the L1 and L2 tables can be valid or invalid.

Looking up a location means finding the entry in L1, which indexes
a particular L2 page table.

Then the entry in the L2 table is looked up – this points to a page
frame (or is invalid).
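
The two-step lookup can be sketched in Java. The field widths here are assumptions chosen for illustration (a 32-bit location split into a 10-bit L1 page number, a 10-bit L2 page number and a 12-bit offset, i.e. 4KB pages), as are the class ForwardMappedSketch and the convention that a null L2 table or a frame number of -1 marks an invalid entry.

```java
import java.util.Arrays;

// Hypothetical two-level translation with an assumed 10/10/12 bit split.
class ForwardMappedSketch {
    static final int OFFSET_BITS = 12;
    static final int L2_BITS = 10;

    static int l1Index(int va) { return va >>> (OFFSET_BITS + L2_BITS); }
    static int l2Index(int va) { return (va >>> OFFSET_BITS) & ((1 << L2_BITS) - 1); }
    static int offset(int va)  { return va & ((1 << OFFSET_BITS) - 1); }

    // l1[i] is an L2 table (null = invalid); each L2 entry is a frame number (-1 = invalid).
    // Returns the physical address, or -1 if either entry is invalid.
    static long translate(int[][] l1, int va) {
        int[] l2 = l1[l1Index(va)];
        if (l2 == null) return -1;
        int frame = l2[l2Index(va)];
        if (frame < 0) return -1;
        return ((long) frame << OFFSET_BITS) | offset(va);
    }

    public static void main(String[] args) {
        int[][] l1 = new int[1024][];      // only the L2 tables in use are allocated
        l1[1] = new int[1024];
        Arrays.fill(l1[1], -1);
        l1[1][3] = 7;                      // page (L1 = 1, L2 = 3) lives in frame 7
        System.out.println(Long.toHexString(translate(l1, 0x00403ABC))); // prints "7abc"
    }
}
```

Only one of the 1024 possible L2 tables is allocated in the example, which is where the memory saving comes from.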


The benefit of this system is that not all L2 tables need to be in
memory at any one time, greatly reducing the amount of memory
taken up.

However, translating a virtual memory location now takes two or
three lookups.


Linear page tables

Linear page tables break up the entire address space into several
(e.g. 4) spaces, each with its own page table which is itself held in
virtual memory.

Thus, to look up a virtual memory location, we identify the correct
space, then look up the page number in the corresponding page
table, then finally retrieve the location of the page frame and use
the offset to get the right location.

This usually requires fewer memory accesses than forward-mapped
tables, because it takes advantage of the fact that the most
common pattern of memory access within processes is contiguous.


Hashed page tables

Hashed page tables take a different approach to the ones we’ve
seen so far.

The page table is loaded with translation information for only those
parts of the address space which are in use (i.e. no invalid entries).

We access the page table using a function which hashes the virtual
memory location.


Each entry in the page table contains, as well as the page frame
number, the original page number and a link to the next entry with
the same hash value.

Thus, looking up a page table entry might mean looking up the
entry based on the hash, then following one or more links in the
table to find the real entry.
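
A hashed page table with chained entries can be sketched in Java as follows. The class HashedPageTableSketch, its method names and the hash function (page number modulo table size) are assumptions made for the example; real systems use other hash functions. Each entry keeps the original page number as a tag, the frame number, and a link to the next entry with the same hash value.

```java
// Hypothetical hashed page table with chaining, for illustration only.
class HashedPageTableSketch {
    static class Entry {
        final long page;    // tag: the original page number
        final long frame;   // the page frame number
        final Entry next;   // next entry with the same hash value
        Entry(long page, long frame, Entry next) {
            this.page = page;
            this.frame = frame;
            this.next = next;
        }
    }

    private final Entry[] buckets;

    HashedPageTableSketch(int size) { buckets = new Entry[size]; }

    // Assumed hash function: page number modulo the table size.
    private int hash(long page) {
        return (int) Math.floorMod(page, (long) buckets.length);
    }

    void map(long page, long frame) {
        int h = hash(page);
        buckets[h] = new Entry(page, frame, buckets[h]);
    }

    // Follow the chain until the tag matches; -1 means the page is not mapped.
    long lookup(long page) {
        for (Entry e = buckets[hash(page)]; e != null; e = e.next) {
            if (e.page == page) return e.frame;
        }
        return -1;
    }

    public static void main(String[] args) {
        HashedPageTableSketch t = new HashedPageTableSketch(4);
        t.map(1, 10);
        t.map(5, 20);                      // 1 and 5 collide in a table of size 4
        System.out.println(t.lookup(1) + " " + t.lookup(5)); // prints "10 20"
    }
}
```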

(Figure: the page number of a virtual memory location is hashed by
f(#) to select a page table entry. Each entry holds a TAG, an ENTRY
and a NEXT link, and the matching entry gives the page frame. Image
adapted from (Doeppner, 2011).)

This approach is particularly efficient when dealing with small
regions of allocated space in a larger, sparsely-populated data
region.

However, the page tables become very large, since each entry
requires three words. This problem is solved by using clustered
page tables.

Using this approach, multiple pages are grouped together into
superpages. Which page we actually require can be inferred from
the result of the hash function.

12 / 15

64-bit systems

Hashed page tables

[Figure: lookup in a clustered page table. Labels: virtual memory location (page no., offset), hash function f(#), page table entries (TAG, ENTRY), page frame.]

Image adapted from (Doeppner, 2011)

Page tables in 64-bit systems

64-bit systems provide 2^64 bytes of addressable space.

A complete page table would occupy 16 petabytes (as opposed to
4MB for a 32-bit system)!
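The two figures can be checked with a little arithmetic. The sketch below assumes 4KB pages (12 offset bits) and 4-byte page table entries; neither value is stated on this slide, but they are the values that reproduce both numbers. The class and method names are made up for the example.

```java
/** Size of a flat (single-level) page table covering a whole address space. */
final class PageTableSize {
    static long flatTableBytes(int addressBits, int pageOffsetBits, int entryBytes) {
        // One entry per page: 2^(addressBits - pageOffsetBits) entries.
        long entries = 1L << (addressBits - pageOffsetBits);
        return entries * entryBytes;
    }

    public static void main(String[] args) {
        // Assumed: 4KB pages and 4-byte entries.
        System.out.println((flatTableBytes(32, 12, 4) >> 20) + " MB"); // prints 4 MB
        System.out.println((flatTableBytes(64, 12, 4) >> 50) + " PB"); // prints 16 PB
    }
}
```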


The other 32-bit solutions are also “worse” in this setting – e.g. a
forward-mapped page table with 4KB pages would require 4 to 7
lookups per translation.

The x64 architecture uses forward-mapped page tables with four
levels for 4KB pages, or three levels when 2MB pages are used.


Data structures assignment/week10c

Virtual memory in practice Fetch policies

CI583: Data Structures and Operating Systems
Memory management part 3


Outline

1 Virtual memory in practice

2 Fetch policies


Virtual memory in practice

The aim of virtual memory is to allow the OS to address a larger
amount of memory than the available primary storage.

The job of the OS is to make sure that using the (expensive)
techniques of virtual memory can be done as efficiently as possible.


Image © http://www.cs.odu.edu/~cs471w


Consider the actions required when a page fault occurs:

1 Hardware address translation facility raises an interrupt.

2 Find a free page frame.

3 If there are no free frames, decide which existing frame to
reuse.

4 If the frame we’re reusing contains a modified page, write it
out to secondary storage.

5 Fetch the required page from secondary storage and modify
the page table.

6 Return from interrupt.
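The six steps can be traced in a small simulation. This is only a sketch: the class `PageFaultSketch` and its counters are invented for illustration, and the choice of victim in step 3 (oldest resident page first) is a placeholder, since replacement policies have not been covered yet.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Counts the disk traffic caused by page faults in a tiny simulated memory. */
final class PageFaultSketch {
    private final Map<Integer, Integer> pageTable = new HashMap<>();     // page -> frame
    private final ArrayDeque<Integer> freeFrames = new ArrayDeque<>();
    private final ArrayDeque<Integer> residentPages = new ArrayDeque<>(); // oldest first
    private final Set<Integer> dirtyPages = new HashSet<>();
    int diskReads = 0;
    int diskWrites = 0;

    PageFaultSketch(int frameCount) {
        for (int f = 0; f < frameCount; f++) {
            freeFrames.add(f);
        }
    }

    /** References a page and returns its frame, handling a fault if necessary. */
    int access(int page, boolean write) {
        Integer frame = pageTable.get(page);
        if (frame == null) {
            frame = handleFault(page);            // step 1: translation fails, "interrupt" raised
        }
        if (write) {
            dirtyPages.add(page);
        }
        return frame;
    }

    private int handleFault(int page) {
        Integer frame = freeFrames.poll();        // step 2: find a free page frame
        if (frame == null) {
            int victim = residentPages.removeFirst(); // step 3: pick a frame to reuse (placeholder policy)
            frame = pageTable.remove(victim);
            if (dirtyPages.remove(victim)) {
                diskWrites++;                     // step 4: write out the modified page
            }
        }
        diskReads++;                              // step 5: fetch the page and update the page table
        pageTable.put(page, frame);
        residentPages.addLast(page);
        return frame;                             // step 6: return from the interrupt
    }

    public static void main(String[] args) {
        PageFaultSketch vm = new PageFaultSketch(2);
        vm.access(0, true);
        vm.access(1, false);
        vm.access(2, false); // no free frame: page 0 is evicted and, being modified, written out
        vm.access(1, false); // hit
        vm.access(0, false); // fault again: page 1 is evicted (clean, so no write)
        System.out.println(vm.diskReads + " reads, " + vm.diskWrites + " writes"); // prints 4 reads, 1 writes
    }
}
```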


This is an expensive process, especially the IO involved, which
could take milliseconds to complete.

We have seen how to map virtual memory locations to page frames
in primary storage, and we have seen how to move pages from
secondary to primary storage when a page fault occurs.

Now we need to know which pages to hold in primary storage and
which to get rid of.


We can break this down into three areas:

1 The fetch policy: when to bring pages in from secondary
storage and which ones to bring.

2 The placement policy: where to put the pages in primary
storage (i.e. how to allocate page frames).

3 The replacement policy: which pages to remove from primary
storage and when to do it.


Fetch policies

Probably the simplest fetch policy we can imagine is demand
paging: fetch a page (only) when a thread references a location in
a page that is not in primary storage.

Thus, the execution of a program, P, begins by loading a single
page that contains the initial instructions for P. Each subsequent
page is retrieved as needed.

If the cost of fetching n pages is n× the cost of fetching one page,
this is the best we can do! It isn’t though – why not?


As we have said, we want to minimise the IO in all of this and
reduce the number of faults that occur.

One way to do this is by prepaging: fetching pages before they are
actually required (i.e. without having to go through a whole page
fault in order to get it).


How do we know which pages a process will require, before it
actually requires them?

Most of the time there is no way to know this, but it is a fair bet
that if a process requests a given page, it might well request its
subsequent pages next.

Fetching 2 or 3 pages in one go is not much more expensive than
fetching one, since the file system, disk controller etc. are already
engaged. This is called readahead.
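A rough way to see the effect of readahead is to count faults for a sequential reference string. The sketch below is a simplification under stated assumptions: memory is unlimited (nothing is ever evicted), and every fault brings in a fixed window of consecutive pages. The name `countFaults` and the window sizes are illustrative only.

```java
import java.util.HashSet;
import java.util.Set;

/** Compares pure demand paging (window 1) with readahead (window > 1). */
final class ReadaheadSketch {
    /** Counts faults when each fault fetches `window` consecutive pages. */
    static int countFaults(int[] references, int window) {
        Set<Integer> resident = new HashSet<>();
        int faults = 0;
        for (int page : references) {
            if (!resident.contains(page)) {
                faults++;
                // Fetch the faulting page plus the pages that follow it.
                for (int p = page; p < page + window; p++) {
                    resident.add(p);
                }
            }
        }
        return faults;
    }

    public static void main(String[] args) {
        int[] sequential = new int[12];
        for (int i = 0; i < sequential.length; i++) {
            sequential[i] = i;
        }
        System.out.println(countFaults(sequential, 1)); // demand paging: prints 12
        System.out.println(countFaults(sequential, 3)); // readahead of 3: prints 4
    }
}
```

For a process that jumps around its address space the extra pages may never be used, so the benefit depends on access being largely sequential.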


Another way in which the fetch policy can have a big impact is in
choosing a page size that reflects the type of files in the system
and the way they are accessed.

For instance the Google File System (GFS) is a distributed file
system that makes a number of design decisions based on the
assumption that a file which is less than 10GB in size is a relatively
small one.

http://research.google.com/archive/gfs.html


Next time

Policies for placement and replacement.


Data structures assignment/filesys.zip

filesys/BitBlock.class
public synchronized class BitBlock extends Block {
public void BitBlock(short);
public void setBit(int);
public void setBit(int, boolean);
public boolean isBitSet(int);
public void resetBit(int);
}

filesys/BitBlock.java

public class BitBlock extends Block {
    /**
     * Construct a bit block of the specified size in bytes.
     * @param blockSize the size of the block in bytes
     */
    public BitBlock(short blockSize) {
        super(blockSize);
    }

    /**
     * Set a specified bit to 1 (true).
     * @param whichBit the bit to set
     */
    public void setBit(int whichBit) {
        bytes[whichBit / 8] |= (byte) (1 << (whichBit % 8));
    }

    /**
     * Set a specified bit to a specified boolean value.
     * @param whichBit the bit to set
     * @param value the value to which the bit should be set
     */
    public void setBit(int whichBit, boolean value) {
        if (value)
            setBit(whichBit);
        else
            resetBit(whichBit);
    }

    /**
     * Checks to see if the specified bit of the block is set (1) or
     * reset (0).
     * @param whichBit the bit to check.
     * @return true if set; false if reset.
     */
    public boolean isBitSet(int whichBit) {
        return (bytes[whichBit / 8] & (byte) (1 << (whichBit % 8))) != 0;
    }

    /**
     * Sets the specified bit of the block to 0 (false).
     * @param whichBit bit to set to 0 (false).
     */
    public void resetBit(int whichBit) {
        bytes[whichBit / 8] &= ~(byte) (1 << (whichBit % 8));
    }
}

filesys/Block.class
public synchronized class Block {
private short blockSize;
public byte[] bytes;
public void Block();
public void Block(short);
public void setBlockSize(short);
public short getBlockSize();
public void read(java.io.RandomAccessFile) throws java.io.IOException, java.io.EOFException;
public void write(java.io.RandomAccessFile) throws java.io.IOException;
}

filesys/Block.java

import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.EOFException;

/**
 * An array of bytes.
 */
public class Block {
    /**
     * The block size in bytes for this Block.
     */
    private short blockSize = 0;

    /**
     * The array of bytes for this block.
     */
    public byte[] bytes = null;

    /**
     * Construct a block.
     */
    public Block() {
        super();
    }

    /**
     * Construct a block with a given block size.
     * @param blockSize the block size in bytes
     */
    public Block(short blockSize) {
        super();
        setBlockSize(blockSize);
    }

    /**
     * Set the block size in bytes for this Block.
     * @param newBlockSize the new block size in bytes
     */
    public void setBlockSize(short newBlockSize) {
        blockSize = newBlockSize;
        bytes = new byte[blockSize];
    }

    /**
     * Get the block size in bytes for this Block.
     * @return the block size in bytes
     */
    public short getBlockSize() {
        return blockSize;
    }

    /**
     * Read a block from a file at the current position.
     * @param file the random access file from which to read
     * @exception java.io.EOFException if attempt to read past end of file
     * @exception java.io.IOException if an I/O error occurs
     */
    public void read(RandomAccessFile file) throws IOException, EOFException {
        file.readFully(bytes);
    }

    /**
     * Write a block to a file at the current position.
     * @param file the random access file to which to write
     * @exception java.io.IOException if an I/O error occurs
     */
    public void write(RandomAccessFile file) throws IOException {
        file.write(bytes);
    }
}

filesys/cat.class
public synchronized class cat {
public static final String PROGRAM_NAME = cat;
public static final int BUF_SIZE = 4096;
public void cat();
public static void main(String[]) throws Exception;
}

filesys/cat.java

/**
 * Reads a sequence of files and writes them to standard output.
 * A simple cat program for a simulated file system.
 *
 * Usage:
 *
 *   java cat input-file ...
 */
public class cat {
    /**
     * The name of this program.
     * This is the program name that is used
     * when displaying error messages.
     */
    public static final String PROGRAM_NAME = "cat";

    /**
     * The size of the buffer to be used for reading from the
     * file. A buffer of this size is filled before writing
     * to the output file.
     */
    public static final int BUF_SIZE = 4096;

    /**
     * Reads files and writes to standard output.
     * @exception java.lang.Exception if an exception is thrown
     * by an underlying operation
     */
    public static void main(String[] argv) throws Exception {
        // initialize the file system simulator kernel
        Kernel.initialize();

        // display a helpful message if no arguments are given
        if (argv.length == 0) {
            System.err.println(PROGRAM_NAME + ": usage: java " + PROGRAM_NAME + " input-file ...");
            Kernel.exit(1);
        }

        // for each filename specified
        for (int i = 0; i < argv.length; i++) {
            String name = argv[i];

            // open the file for reading
            int in_fd = Kernel.open(name, Kernel.O_RDONLY);
            if (in_fd < 0) {
                Kernel.perror(PROGRAM_NAME);
                System.err.println(PROGRAM_NAME + ": unable to open input file \"" + name + "\"");
                Kernel.exit(2);
            }

            // create a buffer for reading data
            byte[] buffer = new byte[BUF_SIZE];

            // read data while we can
            int rd_count;
            while (true) {
                // read a buffer full of data
                rd_count = Kernel.read(in_fd, buffer, BUF_SIZE);

                // if we encounter an error or get to the end, quit the loop
                if (rd_count <= 0)
                    break;

                // write whatever we read to standard output
                System.out.write(buffer, 0, rd_count);
            }

            // close the input file
            Kernel.close(in_fd);

            // exit with failure if we encounter an error
            if (rd_count < 0) {
                Kernel.perror(PROGRAM_NAME);
                System.err.println(PROGRAM_NAME + ": error during read from input file");
                Kernel.exit(3);
            }
        }

        // exit with success if we read all the files without error
        Kernel.exit(0);
    }
}

filesys/cp.class
public synchronized class cp {
public static final String PROGRAM_NAME = cp;
public static final int BUF_SIZE = 4096;
public static final short OUTPUT_MODE = 448;
public void cp();
public static void main(String[]) throws Exception;
}

filesys/cp.java

/**
 * A simple copy program for a simulated file system.
 *
 * Usage:
 *
 *   java cp input-file output-file
 */
public class cp {
    /**
     * The name of this program.
     * This is the program name that is used
     * when displaying error messages.
     */
    public static final String PROGRAM_NAME = "cp";

    /**
     * The size of the buffer to be used when reading files.
     */
    public static final int BUF_SIZE = 4096;

    /**
     * The file mode to use when creating the output file.
     */
    // ??? perhaps this should be the same mode as the input file
    public static final short OUTPUT_MODE = 0700;

    /**
     * Copies an input file to an output file.
     * @exception java.lang.Exception if an exception is thrown by
     * an underlying operation
     */
    public static void main(String[] argv) throws Exception {
        // initialize the file system simulator kernel
        Kernel.initialize();

        // make sure we got the correct number of parameters
        if (argv.length != 2) {
            System.err.println(PROGRAM_NAME + ": usage: java " + PROGRAM_NAME + " input-file output-file");
            Kernel.exit(1);
        }

        // give the parameters more meaningful names
        String in_name = argv[0];
        String out_name = argv[1];

        // open the input file
        int in_fd = Kernel.open(in_name, Kernel.O_RDONLY);
        if (in_fd < 0) {
            Kernel.perror(PROGRAM_NAME);
            System.err.println(PROGRAM_NAME + ": unable to open input file \"" + in_name + "\"");
            Kernel.exit(2);
        }

        // open the output file
        int out_fd = Kernel.creat(out_name, OUTPUT_MODE);
        if (out_fd < 0) {
            Kernel.perror(PROGRAM_NAME);
            System.err.println(PROGRAM_NAME + ": unable to open output file \"" + argv[1] + "\"");
            Kernel.exit(3);
        }

        // read and write buffers full of data while we can
        int rd_count;
        byte[] buffer = new byte[BUF_SIZE];
        while (true) {
            // read a buffer full from the input file
            rd_count = Kernel.read(in_fd, buffer, BUF_SIZE);

            // if error or nothing read, quit the loop
            if (rd_count <= 0)
                break;

            // write whatever we read to the output file
            int wr_count = Kernel.write(out_fd, buffer, rd_count);

            // if error or nothing written, give message and exit
            if (wr_count <= 0) {
                Kernel.perror(PROGRAM_NAME);
                System.err.println(PROGRAM_NAME + ": error during write to output file");
                Kernel.exit(4);
            }
        }

        // close the files
        Kernel.close(in_fd);
        Kernel.close(out_fd);

        // check to see if the final read was successful; exit accordingly
        if (rd_count == 0)
            Kernel.exit(0);
        else {
            Kernel.perror(PROGRAM_NAME);
            System.err.println(PROGRAM_NAME + ": error during read from input file");
            Kernel.exit(5);
        }
    }
}

filesys/DirectoryEntry.class
public synchronized class DirectoryEntry {
public static final int MAX_FILENAME_LENGTH = 14;
public static final int DIRECTORY_ENTRY_SIZE = 16;
public short d_ino;
public byte[] d_name;
public void DirectoryEntry();
public void DirectoryEntry(short, String);
public void setIno(short);
public short getIno();
public void setName(String);
public String getName();
public void write(byte[], int);
public void read(byte[], int);
public String toString();
public static void main(String[]) throws Exception;
}

filesys/DirectoryEntry.java

/**
 * A directory entry for a simulated file system.
 */
public class DirectoryEntry {
    /**
     * Maximum length of a file name.
     */
    public static final int MAX_FILENAME_LENGTH = 14;

    /**
     * Size of a directory entry (on disk) in bytes.
     */
    public static final int DIRECTORY_ENTRY_SIZE = MAX_FILENAME_LENGTH + 2;

    /**
     * i-node number for this DirectoryEntry
     */
    public short d_ino = 0;

    /**
     * file name for this DirectoryEntry
     */
    public byte[] d_name = new byte[MAX_FILENAME_LENGTH];

    /**
     * Constructs an empty DirectoryEntry.
     */
    public DirectoryEntry() {
        super();
    }

    /**
     * Constructs a DirectoryEntry for the given inode and name.
     * Note that the name is stored internally as a byte[],
     * not as a string.
     * @param ino the inode number for this DirectoryEntry
     * @param name the file name for this DirectoryEntry
     */
    public DirectoryEntry(short ino, String name) {
        super();
        setIno(ino);
        setName(name);
    }

    /**
     * Sets the inode number for this DirectoryEntry
     * @param newIno the new inode number
     */
    public void setIno(short newIno) {
        d_ino = newIno;
    }

    /**
     * Gets the inode number for this DirectoryEntry
     * @return the inode number
     */
    public short getIno() {
        return d_ino;
    }

    /**
     * Sets the name for this DirectoryEntry
     * @param newName the new name
     */
    public void setName(String newName) {
        // copy at most MAX_FILENAME_LENGTH characters and pad the
        // remainder of d_name with zero bytes
        for (int i = 0; i < MAX_FILENAME_LENGTH; i++)
            if (i < newName.length())
                d_name[i] = (byte) newName.charAt(i);
            else
                d_name[i] = (byte) 0;
    }

    /**
     * Gets the name for this DirectoryEntry
     * @return the name
     */
    public String getName() {
        StringBuffer s = new StringBuffer(MAX_FILENAME_LENGTH);
        for (int i = 0; i < MAX_FILENAME_LENGTH; i++) {
            if (d_name[i] == (byte) 0)
                break;
            s.append((char) d_name[i]);
        }
        return s.toString();
    }

    /**
     * Writes a DirectoryEntry to the specified byte array at the specified
     * offset.
     * @param buffer the byte array to which the directory entry should be written
     * @param offset the offset from the beginning of the buffer to which the
     * directory entry should be written
     */
    public void write(byte[] buffer, int offset) {
        buffer[offset] = (byte) (d_ino >>> 8);
        buffer[offset + 1] = (byte) d_ino;
        for (int i = 0; i < d_name.length; i++)
            buffer[offset + 2 + i] = d_name[i];
    }

    /**
     * Reads a DirectoryEntry from the specified byte array at the specified
     * offset.
     * @param buffer the byte array from which the directory entry should be read
     * @param offset the offset from the beginning of the buffer from which the
     * directory entry should be read
     */
    public void read(byte[] buffer, int offset) {
        int hi = buffer[offset] & 0xff;
        int lo = buffer[offset + 1] & 0xff;
        d_ino = (short) (hi << 8 | lo);
        for (int i = 0; i < d_name.length; i++)
            d_name[i] = buffer[offset + 2 + i];
    }

    /**
     * Converts a DirectoryEntry to a printable string.
     * @return the printable string
     */
    public String toString() {
        StringBuffer s = new StringBuffer("DirectoryEntry[");
        s.append(getIno());
        s.append(',');
        s.append(getName());
        s.append(']');
        return s.toString();
    }

    /**
     * A test driver for this class.
     * @exception java.lang.Exception any exception which may occur.
     */
    public static void main(String[] args) throws Exception {
        DirectoryEntry root = new DirectoryEntry((short) 1, "/");
        System.out.println(root.toString());
    }
}

filesys/dump.class
public synchronized class dump {
public void dump();
public static void main(String[]);
}

filesys/dump.java

import java.io.*;
import java.lang.Integer;

/**
 * a simple dump program
 * prints the offset, hexvalue, and decimal value for each byte in a
 * file, for all files mentioned on the command line.
 *
 * Usage:
 *
 *   java dump input-file
 */
public class dump {
    public static void main(String[] args) {
        for (int i = 0; i < args.length; i++) {
            // open a file
            try {
                FileInputStream ifile = new FileInputStream(args[i]);
                BufferedInputStream in = new BufferedInputStream(ifile);

                // while we are able to read bytes from it
                int c;
                for (int j = 0; (c = in.read()) != -1; j++) {
                    if (c > 0) {
                        System.out.print(j + " " + Integer.toHexString(c) + " " + c);
                        if (c >= 32 && c < 127)
                            System.out.print(" " + (char) c);
                        System.out.println();
                    }
                }
                in.close();
            } catch (FileNotFoundException e) {
                System.out.println("error: unable to open input file " + args[i]);
            } catch (IOException e) {
                System.out.println("error: unable to read from file " + args[i]);
            }
        }
    }
}

filesys/FileDescriptor.class
public synchronized class FileDescriptor {
private FileSystem fileSystem;
private IndexNode indexNode;
private short deviceNumber;
private short indexNodeNumber;
private int flags;
private int offset;
private byte[] bytes;
void FileDescriptor(short, short, int) throws java.io.IOException;
void FileDescriptor(FileSystem, IndexNode, int);
public void setDeviceNumber(short);
public short getDeviceNumber();
public IndexNode getIndexNode();
public void setIndexNodeNumber(short);
public short getIndexNodeNumber();
public int getFlags();
public byte[] getBytes();
public short getMode();
public int getSize();
public void setSize(int) throws java.io.IOException;
public short getBlockSize();
public int getOffset();
public void setOffset(int);
public int readBlock(short) throws Exception;
public int writeBlock(short) throws Exception;
}

filesys/FileDescriptor.java

import java.io.IOException;

/**
 * A file descriptor for an open file in a simulated file system.
 */
public class FileDescriptor {
    private FileSystem fileSystem = null;
    private IndexNode indexNode = null;
    private short deviceNumber = -1;
    private short indexNodeNumber = -1;
    private int flags = 0;
    private int offset = 0;
    private byte[] bytes = null;

    FileDescriptor(short newDeviceNumber, short newIndexNodeNumber, int newFlags) throws IOException {
        super();
        deviceNumber = newDeviceNumber;
        indexNodeNumber = newIndexNodeNumber;
        flags = newFlags;
        fileSystem = Kernel.openFileSystems[deviceNumber];
        indexNode = new IndexNode();
        fileSystem.readIndexNode(indexNode, indexNodeNumber);
        bytes = new byte[fileSystem.getBlockSize()];
    }

    FileDescriptor(FileSystem newFileSystem, IndexNode newIndexNode, int newFlags) {
        super();
        fileSystem = newFileSystem;
        indexNode = newIndexNode;
        flags = newFlags;
        bytes = new byte[fileSystem.getBlockSize()];
    }

    public void setDeviceNumber(short newDeviceNumber) {
        deviceNumber = newDeviceNumber;
    }

    public short getDeviceNumber() {
        return deviceNumber;
    }

    public IndexNode getIndexNode() {
        return indexNode;
    }

    public void setIndexNodeNumber(short newIndexNodeNumber) {
        indexNodeNumber = newIndexNodeNumber;
    }

    public short getIndexNodeNumber() {
        return indexNodeNumber;
    }

    public int getFlags() {
        return flags;
    }

    public byte[] getBytes() {
        return bytes;
    }

    public short getMode() {
        return indexNode.getMode();
    }

    public int getSize() {
        return indexNode.getSize();
    }

    public void setSize(int newSize) throws IOException {
        indexNode.setSize(newSize);

        // write the inode
        fileSystem.writeIndexNode(indexNode, indexNodeNumber);
    }

    public short getBlockSize() {
        return fileSystem.getBlockSize();
    }

    public int getOffset() {
        return offset;
    }

    public void setOffset(int newOffset) {
        offset = newOffset;
    }

    public int readBlock(short relativeBlockNumber) throws Exception {
        if (relativeBlockNumber >= IndexNode.MAX_FILE_BLOCKS) {
            Kernel.setErrno(Kernel.EFBIG);
            return -1;
        }

        // ask the IndexNode for the actual block number
        // given the relative block number
        int blockOffset = indexNode.getBlockAddress(relativeBlockNumber);

        if (blockOffset == FileSystem.NOT_A_BLOCK) {
            // clear the bytes if it's a block that was never written
            int blockSize = fileSystem.getBlockSize();
            for (int i = 0; i < blockSize; i++)
                bytes[i] = (byte) 0;
        } else {
            // read the actual block into bytes
            fileSystem.read(bytes, fileSystem.getDataBlockOffset() + blockOffset);
        }
        return 0;
    }

    public int writeBlock(short relativeBlockNumber) throws Exception {
        if (relativeBlockNumber >= IndexNode.MAX_FILE_BLOCKS) {
            Kernel.setErrno(Kernel.EFBIG);
            return -1;
        }

        // ask the IndexNode for the actual block number
        // given the relative block number
        int blockOffset = indexNode.getBlockAddress(relativeBlockNumber);

        if (blockOffset == FileSystem.NOT_A_BLOCK) {
            // allocate a block; quit if we can't
            blockOffset = fileSystem.allocateBlock();

Overview

The MOSS File System Simulator is a collection of Java classes
which simulate the file system calls available in a typical
Unix-like operating system. The “Kernel” class contains
methods (functions) like “creat()”, “open()”, “read()”,
“write()”, “close()”, etc., which read and write blocks
in an underlying file in much the same way that a real
file system would read and write blocks on an underlying
disk device.

In addition to the “Kernel” class, there are a number of
underlying classes to support the implementation of the kernel.
The classes FileSystem, IndexNode, DirectoryEntry, SuperBlock,
Block, BitBlock, FileDescriptor, and Stat contain all data
structures and algorithms which implement the simulated
file system.

Also included are a number of sample programs which can
be used to operate on a simulated file system. The Java
programs “ls”, “cat”, “mkdir”, “mkfs”, etc., perform
file system operations to list directories, display files,
create directories, and create (initialize) file systems.
These programs illustrate the various file system calls
and allow the user to carry out various read and write
operations on the simulated file system.

As mentioned above, there is a backing file for our simulated
file system. A “dump” program is included with the distribution
so that you can examine this file, byte-by-byte. Any dump
program may be used (e.g., the “od” program in Unix); we include
this one which is simple to use and understand, and can be
used with any operating system.

There are a number of ways you can use the simulator to
get a better understanding of file systems. You can

- use the provided utility programs (mkfs, mkdir, ls, cat, etc.) to
  perform operations on the simulated file system and use
  the dump program to examine the underlying file and observe
  any changes,

- examine the sample utility programs to see how they use the system
  call interface to perform file operations,

- enhance the sample utility programs to provide additional
  functionality,

- write your own utility programs to extend the functionality of the
  simulated file system, and

- modify the underlying Kernel and other implementation classes
  to extend the functionality of the file system.
In the sections which follow, you will learn what you need to
know to perform each of these activities.

Using File System Simulator Programs

Using mkfs

The mkfs program creates a file system backing file.
It does this by creating a file whose size is specified by the
block size and number of blocks given. It writes the superblock,
the free list blocks, the inode blocks, and the data blocks
for a new file system. Note that it will overwrite any existing
file of the name specified, so be careful when you use this program.

This program is similar to the “mkfs” program found in
Unix-like operating systems.

The general format for the mkfs command is

java mkfs file-name block-size blocks

where
file-name

is the name of the backing file to create (e.g., filesys.dat).
Note that this is the name of a real file, not a file in the simulator.
This is the file that the simulator uses to simulate the disk device
for the simulated file system.
This may be any valid file name in your operating system environment.

block-size

is the block size to be used for the file system (e.g., 256).
This should be a multiple of the index node (i-node) size (usually 64)
and the directory entry size (usually 16). Modern operating systems
usually use a size of 1024, or 512 bytes. We use 128 or 256 byte block
sizes in many of our examples so that you can quickly see what happens
when directories grow beyond one block. This should be a decimal
number not less than 64, but less than 32768.

blocks

is the number of blocks to create in the file system (e.g., 40).
This number includes any blocks that may be used for the superblock,
free list management, inodes, and data blocks. We use a relatively small
number here so that you can quickly see what happens if you run out of
disk space. This can be any decimal number greater than 3, but not greater
than 2^24 - 1 (the maximum number of blocks), although you may not
have sufficient space to create a very large file.

For example, the command
java mkfs filesys.dat 256 40

will create (or overwrite) a file “filesys.dat” so that it contains
40 256-byte blocks for a total of 10240 bytes.

The output from the command should look something like this:

block_size: 256
blocks: 40
super_blocks: 1
free_list_blocks: 1
inode_blocks: 8
data_blocks: 30
block_total: 40

From the output you can see that
one block is needed for the superblock, one for
free list management, eight for index nodes, and the remaining
30 are available for data blocks.

Why is there 1 block for free list management? Note that 30 blocks
require 30 bits in the free list bitmap. Since
256 bytes/block * 8 bits/byte = 2048 bits/block, clearly
one bitmap block is sufficient to track block allocation
for this file system.

Why are there 8 blocks for index nodes? Note that 30 blocks could
result in 30 inodes if many one-block files or directories are created.
Since each inode requires 64 bytes, only 4 will fit in a block.
Therefore, 8 blocks are set aside for up to 32 inodes.
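The reasoning above can be reproduced with a short calculation. The sketch below is not the mkfs source; it is one way to arrive at the same figures, assuming one superblock, one free-list bit per data block, one 64-byte inode per data block, and the largest number of data blocks that still fits. The class name `MkfsLayoutSketch` and its methods are invented for the example.

```java
/** Works out a block layout that matches the sample mkfs output. */
final class MkfsLayoutSketch {
    static final int INODE_SIZE = 64; // bytes per index node, as stated in the text

    private static int ceilDiv(int a, int b) {
        return (a + b - 1) / b;
    }

    /** Returns {superBlocks, freeListBlocks, inodeBlocks, dataBlocks}. */
    static int[] layout(int blockSize, int blocks) {
        int bitsPerBlock = blockSize * 8;            // free-list bits held in one block
        int inodesPerBlock = blockSize / INODE_SIZE; // inodes held in one block
        // Try the largest data area first and shrink it until everything fits.
        for (int data = blocks - 3; data >= 1; data--) {
            int freeList = ceilDiv(data, bitsPerBlock);
            int inode = ceilDiv(data, inodesPerBlock);
            if (1 + freeList + inode + data <= blocks) {
                return new int[] {1, freeList, inode, data};
            }
        }
        throw new IllegalArgumentException("too few blocks for a file system");
    }

    public static void main(String[] args) {
        int[] l = layout(256, 40);
        System.out.println("super_blocks: " + l[0]);     // prints super_blocks: 1
        System.out.println("free_list_blocks: " + l[1]); // prints free_list_blocks: 1
        System.out.println("inode_blocks: " + l[2]);     // prints inode_blocks: 8
        System.out.println("data_blocks: " + l[3]);      // prints data_blocks: 30
    }
}
```

With 31 data blocks the total would be 1 + 1 + 8 + 31 = 41 blocks, one more than requested, which is why 30 is the largest data area that fits in 40 blocks.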

Using mkdir

The mkdir program can be used to create new
directories in our simulated file system. It does this
by creating the file specified as a directory file, and
then writing the directory entries for “.” and “..” to the
newly created file. Note that all directories leading
to the new directory must already exist.

This program is similar to the “mkdir” command in Unix-like and
MS-DOS-related operating systems.

The general format for the mkdir command is

java mkdir directory-path

where
directory-path

is the path of the directory to be created (e.g., “/root”, or
“temp”, or “../home/rayo/moss/filesys”). If directory-path
does not begin with a “/”, then it is appended to the
path name of the working directory for the default process.

For example, the command
java mkdir /home

creates a directory called “home” as a subdirectory of the root
directory of the file system.

Similarly, the command

java mkdir /home/rayo

creates a directory called “rayo” as a subdirectory of the
“home” directory, which is presumed to already exist as a
subdirectory of the root directory of the file system.

Using ls

The ls program is used to list information
about files and directories in our simulated file system.
For each file or directory name given it displays information
about the files named, or in the case of directories, for
each file in the directories named.

This program is similar to
the “ls” command in Unix-like operating systems, or the “dir”
command in DOS-related operating systems.

The general format for the ls command is

java ls path-name …

where
path-name …

is a space-separated list of one or more file or
directory path names.

For example, the command
java ls /home

lists the contents of the “/home” directory. For each file
in the directory, a line is printed showing the name of the
file or subdirectory, and other pertinent information such
as size.
The output from the command should look something like this:

/home:
1 48 .
0 48 ..
2 32 rayo
total files: 3

In this case we see that the “/home” directory contains
entries for “.”, “..”, and “rayo”.

Using tee

The tee program reads from standard input and
writes whatever is read to both standard output and
the named file. You can use this program to create
files in our simulated file system with content created
in the operating system environment.
This program is similar to the “tee” command found in
many Unix-like operating systems.

The general format for the tee command is

java tee file-path

where
file-path

is the name of a file to be created in the simulated
file system. If the named file already exists, it will
be overwritten.

For example,
echo "howdy, podner" | java tee /home/rayo/hello.txt

causes the single line “howdy, podner” to be written
to the file “/home/rayo/hello.txt”.
The output from the command is

howdy, podner

which you should note was the same as the input sent
to the tee program by the “echo” command.
Note that the “|” (pipe) is almost always used with the
tee program. Users of Unix-like operating systems
will find the “echo”, and “cat” commands useful to produce
input for the pipe to tee. Users of MS-DOS-related
operating systems will find the “echo” and “type” commands
to be useful in this regard.

If you wish to simply enter text directly to a file, then
you may use tee directly (i.e., without the pipe).
Users of Unix-like operating systems will need to use
CTRL-D to signal the end of input. Users of MS-DOS-related
operating systems will need to use CTRL-Z to signal the
end of input.

Using cp

The cp program allows you to copy the contents
from one file to another in our simulated file system.
If the destination file already exists, it will be overwritten.
This program is similar to the “cp” command in Unix-like
operating systems, and the “copy” command in MS-DOS-related
operating systems.

The general format of the “cp” command is

java cp input-file-name output-file-name

where
input-file-name

is the path-name for the file to be copied (i.e., the
source file), and

output-file-name

is the path-name for the file to be created (i.e., the
target file).

For example,
java cp /home/rayo/hello.txt /home/rayo/greeting.txt

creates a new file “/home/rayo/greeting.txt” by
copying to it the contents of file “/home/rayo/hello.txt”.

Using cat

The cat program reads the contents of a named file
and writes it to standard output. The cat program
is generally used to display the contents of a file.
This program is similar to the “cat” command in Unix-like
operating systems, or the “type” command in MS-DOS-related
operating systems.

The general format of the cat command line is

java cat file-name

where
file-name

is the name of the file from which data are to be
read for writing to standard output.

For example,
java cat /home/rayo/greeting.txt

causes the file “/home/rayo/greeting.txt” to be read,
the contents of which are written to standard output.
In this case, the output from the program might look
something like this

howdy, podner

Dumping the File System

While you are working with the file system simulator,
you may wish to dump the contents of the backing file
to see if it contains what you think
it contains. The dump program shows the contents
of a file in the operating environment, one byte at a
time, in various formats (hexadecimal, decimal, ASCII).

Note that dump dumps the contents of a real file,
not a file in our simulated file system.

The general format of the dump command line is

java dump file-name

where
file-name

is the name of the file to be dumped. This should
generally be the name of the backing file for the file
system simulator (e.g., “filesys.dat”).

The general format of the dump output is
addr hex dec asc

where
addr

is the decimal address of the byte,

hex

is the hexadecimal value of the byte,

dec

is the decimal value of the byte, and

asc

is the corresponding ASCII character if the
value is between 33 and 127 (decimal).

Each line of dump output corresponds to a single byte
in the file.
To keep the listing brief, dump only displays
non-zero bytes from the input file.
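
As a rough sketch of the output format just described, the following Java formats a byte array the same way: one line per non-zero byte, with the address, the hexadecimal value, the decimal value, and the ASCII character when the value lies between 33 and 127. The class and method names are hypothetical, and this is not the dump.java from the distribution.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the "addr hex dec asc" line format.
class DumpFormatSketch {
    static List<String> formatDump(byte[] data) {
        List<String> lines = new ArrayList<>();
        for (int addr = 0; addr < data.length; addr++) {
            int value = data[addr] & 0xff;        // treat the byte as unsigned
            if (value == 0) {
                continue;                          // zero bytes are not listed
            }
            StringBuilder line = new StringBuilder();
            line.append(addr).append(' ')
                .append(Integer.toHexString(value)).append(' ')
                .append(value);
            if (value >= 33 && value <= 127) {     // range taken from the description above
                line.append(' ').append((char) value);
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] sample = new byte[18];
        sample[0] = 1;
        sample[5] = 0x28;
        sample[17] = 0x0a;
        for (String line : formatDump(sample)) {
            System.out.println(line);
        }
    }
}
```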

For example

java dump filesys.dat | more

causes the contents of the file “filesys.dat” to be
displayed, one line per byte. The “| more” causes
you to be prompted for each page of the output.
The first page of the output should look
something like this:

0 1 1
5 28 40 (
9 1 1
13 2 2
17 a 10
256 1f 31
512 40 64 @
515 3 3
523 30 48 0
527 ff 255
528 ff 255
529 ff 255
530 ff 255
531 ff 255
532 ff 255
533 ff 255
534 ff 255
535 ff 255
536 ff 255
537 ff 255
538 ff 255
539 ff 255
540 ff 255
541 ff 255

You should notice, for example, that the first block
(the super block) contains a few numeric values corresponding
to the block size (the 1 in the 0 byte means 256),
number of blocks, etc. The second block (starting at byte 256)
contains a few bits that are set, indicating that the first few
blocks are allocated. The third block (starting at 512)
contains a few index nodes; the FF/255 values indicate that
a direct block is unallocated. A little further down you
will see “.”, and “..” for the directory entries for the
root file system, and other data blocks.
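
To make the reading of the listing concrete, here is a small Java sketch that decodes the same bytes. It assumes the index node layout used by IndexNode.write() in this distribution (two bytes each for mode, nlink, uid, and gid, four bytes for size, then ten 3-byte direct block addresses, all most-significant byte first). The class name and helper methods are hypothetical, and the run of FF bytes is assumed to continue past the first page of output up to address 553.

```java
// Hypothetical illustration; decodes values copied from the sample dump listing.
class DumpDecodeSketch {
    /** Reads an unsigned big-endian value of the given width (in bytes). */
    static int readBigEndian(byte[] buffer, int offset, int width) {
        int value = 0;
        for (int i = 0; i < width; i++) {
            value = (value << 8) | (buffer[offset + i] & 0xff);
        }
        return value;
    }

    /** Rebuilds the first index node of the third block (addresses 512..575). */
    static byte[] sampleIndexNode() {
        byte[] inode = new byte[64];
        inode[0] = 0x40;                 // address 512
        inode[3] = 3;                    // address 515
        inode[11] = 0x30;                // address 523
        for (int i = 15; i < 42; i++) {  // addresses 527..553 (assumed to run on past 541)
            inode[i] = (byte) 0xff;
        }
        return inode;
    }

    public static void main(String[] args) {
        // Super block: the 1 in byte 0 followed by a 0 in byte 1 is the block size.
        System.out.println("block size = " + readBigEndian(new byte[] {1, 0}, 0, 2));
        // Free list block: each set bit in 0x1f marks one allocated block.
        System.out.println("allocated blocks = " + Integer.bitCount(0x1f));

        byte[] inode = sampleIndexNode();
        System.out.println("mode (octal) = " + Integer.toOctalString(readBigEndian(inode, 0, 2)));
        System.out.println("nlink = " + readBigEndian(inode, 2, 2));
        System.out.println("size = " + readBigEndian(inode, 8, 4));
        System.out.println("direct block 0 = " + readBigEndian(inode, 12, 3));
        System.out.println("direct block 1 = 0x" + Integer.toHexString(readBigEndian(inode, 15, 3)));
    }
}
```

The last value printed is 0xffffff, which is the NOT_A_BLOCK marker defined in FileSystem, so only the first direct block of this index node is in use.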

Simulator Configuration File

Each file system simulator program must call Kernel.initialize()
before calling any of the other Kernel methods. The
initialize() method reads a configuration file
(“filesys.conf” is the default),
opens the backing file for the file system (“filesys.dat” is the default),
and performs other initializations.
This section of the user guide describes the
various options which may be set in the configuration file.

Configuration File Options

Name: filesystem.root.filename
Description: The name of the file containing the root file system for the
simulation.
Default Value: filesys.dat

Name: filesystem.root.mode
Description: The mode to use when opening the root file system backing file.
The mode should either be “rw” for reading and writing, or “r” for
read-only access.
Default Value: rw

Name: process.uid
Description: The numeric user id (uid) to use for the default
process context. This should be a number between 0 and 32767.
Default Value: 1

Name: process.gid
Description: The numeric group id (gid) to use for the default
process context. This should be a number between 0 and 32767.
Default Value: 1

Name: process.umask
Description: The umask to use for the default process context. This should
be an octal number between 000 and 777.
Default Value: 022

Name: process.dir
Description: The working directory in the simulated file system to be used
for the default process context. This should be a string that
starts with “/”.
Default Value: /root

Name: process.max_open_files
Description: The maximum number of files that may be open at a
time by a process. When a process context is created, this many slots
are created for possible open files.
Default Value: 10

Name: kernel.max_open_files
Description: The maximum number of files that may be open at one time by
all processes in the simulation. When the simulator starts, this
many slots are created for possible open files.
Default Value: 20

A Sample Configuration File

In addition to the standard configuration file, “filesys.conf”,
the distribution also includes a smaller sample configuration
file, “sample.conf”. This is shown below to illustrate a typical
configuration file.

!
! my personal filesys configuration file
!
filesystem.root.filename = rayo.dat
filesystem.root.mode = r
process.uid = 1000
process.gid = 1000
process.umask = 002
process.dir = /home/rayo

In this particular example, the file system is contained in the
backing file “rayo.dat”, which is here being opened for read-only
access. The working directory for the default process context
is “/home/rayo”, with the uid, gid, and umask shown.
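
The way options fall back to their defaults can be sketched with java.util.Properties, whose file format also treats lines starting with "!" as comments. Whether Kernel.initialize() actually uses this class is an assumption made for this illustration, and ConfigSketch is a hypothetical name. The default values are the ones documented under Configuration File Options.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration; not the simulator's own configuration code.
class ConfigSketch {
    /** Parses configuration text, falling back to the documented defaults. */
    static Properties load(String text) {
        Properties defaults = new Properties();
        defaults.setProperty("filesystem.root.filename", "filesys.dat");
        defaults.setProperty("filesystem.root.mode", "rw");
        defaults.setProperty("process.uid", "1");
        defaults.setProperty("process.gid", "1");
        defaults.setProperty("process.umask", "022");
        defaults.setProperty("process.dir", "/root");
        defaults.setProperty("process.max_open_files", "10");
        defaults.setProperty("kernel.max_open_files", "20");

        Properties config = new Properties(defaults);
        try {
            config.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "!",
            "! my personal filesys configuration file",
            "!",
            "filesystem.root.filename = rayo.dat",
            "filesystem.root.mode = r",
            "process.uid = 1000",
            "process.gid = 1000",
            "process.umask = 002",
            "process.dir = /home/rayo");
        Properties config = load(sample);
        // The umask is written in octal, so parse it with radix 8.
        int umask = Integer.parseInt(config.getProperty("process.umask"), 8);
        System.out.println("backing file: " + config.getProperty("filesystem.root.filename"));
        System.out.println("umask (decimal): " + umask);
        // Not set in the sample, so the default applies.
        System.out.println("max open files: " + config.getProperty("process.max_open_files"));
    }
}
```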

Specifying an Alternate Configuration File

The default configuration file is named “filesys.conf” and is
included in the application distribution. You may modify this
file directly to set various options, or you may create your
own configuration file and specify the name of this new file
when you launch your simulator programs.

If you choose to create your own configuration file, you
will need to define a system property “filesys.conf”
which contains the name of the file. For example, suppose you
wanted to run the “ls” program using “my_filesys.conf” as the
configuration file. Your java command would look something
like this:

java -Dfilesys.conf=my_filesys.conf ls /home

If there is no value set for the “filesys.conf” system property,
then the name “filesys.conf” is used as the default configuration
filename.
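
The fallback rule described here matches the two-argument form of System.getProperty(). The sketch below shows that lookup only; the class name is hypothetical and the simulator's own code may be organised differently.

```java
// Hypothetical illustration of the "system property with default" lookup.
class ConfigNameSketch {
    /** Returns the configuration file name, or "filesys.conf" if -Dfilesys.conf was not given. */
    static String configFileName() {
        return System.getProperty("filesys.conf", "filesys.conf");
    }

    public static void main(String[] args) {
        System.out.println("configuration file: " + configFileName());
    }
}
```

Running this with "java -Dfilesys.conf=my_filesys.conf ConfigNameSketch" reports my_filesys.conf; running it without the -D option reports filesys.conf.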

Writing File System Simulator Programs

Writing programs that use the File System Simulator
requires the use of the Kernel class,
and may involve the use of the classes
Stat and DirectoryEntry.
If you’re writing ordinary programs that use the
standard file system calls, you should not need to reference
any other classes.

These three classes are described briefly here. For more
information, follow the link for the class to the javadoc
for that class.

Kernel

sets up the simulator environment and defines all the
system calls. This class defines: the method
initialize(), which is used to initialize
the file system simulator; the creat(), open(),
read(), write(), close(),
and other methods which simulate the work of a file system;
and constants like EBADF, S_IFDIR, and
O_RDONLY which are used to represent parameter or
return values for the system calls. All the methods and
fields of Kernel are static; you do not instantiate a
Kernel object. For examples, see any of the sample
programs (i.e., cat.java, cp.java,
ls.java, etc.)

Stat

is a data structure that represents information about a
file or directory. This intends to faithfully represent
the Unix stat struct. You may reference fields
within a stat object directly (e.g., stat.st_ino),
or using JavaBean-style accessor/mutator methods (e.g.,
stat.getIno() or stat.setIno()). Stat
objects are updated by the methods
Kernel.stat() and Kernel.fstat().
For examples, see mkdir.java.

DirectoryEntry

is a data
structure that represents a single record in a directory
file. This intends to faithfully represent a Unix
dirent struct. It contains an index node number and
a file name. You may reference the fields directly (e.g.,
dirent.d_ino), or using JavaBean-style accessor/mutator
methods (e.g., dirent.getIno() or dirent.setIno()).
However, Java programmers may find it more convenient to use
the getName() and setName()
(which use String)
instead of the field d_name (which is byte[]).
DirectoryEntry objects are updated by
the method Kernel.readdir(). For examples, see
mkdir.java and ls.java.

For more information about Unix system calls and the
stat and dirent structs, refer to a
Unix system manual. Users of Unix-like systems may
find the commands “man -S 2 creat”,
“man -S 2 open”, etc. to be helpful.
All programs that use the File System Simulator should
adhere to the following guidelines:

Invoke the method Kernel.initialize()
before any other File System Simulator calls.

Use Kernel.exit() when you wish to
terminate processing in your program.

Check for errors after each system call (e.g.,
creat(), open(), read(),
write(), etc.).
Nearly all the system calls return -1 if an error
occurs.

Use Kernel.perror()
to print the message associated with an error.

Use Kernel.getErrno()
to determine which error occurred, if needed. Note that in standard
Unix programs you would reference the static process
variable “errno”.

For examples, take a look at the following sample programs
in the distribution:

cat.java

cp.java

ls.java

mkdir.java

tee.java

Collectively, these sample programs invoke all of the core methods
(system calls) of the file system simulator.

Enhancing the File System Simulator

Adding new features to the File System Simulator
is an excellent way to probe your understanding of
file system operation, and to investigate new features.
Enhancements will almost certainly require changes
to the class Kernel, and may necessitate
changes to the sample programs described above.
This section describes the other classes that
implement the functionality of the simulator so
that you may understand the intended organization
of these components when making a proposed enhancement.

The following are the internal classes for
the file system simulator:

BitBlock

is a data structure that views a device block as a
sequence of bits. The methods setBit(),
resetBit(), and isBitSet() are used
to set, reset, or check a bit in the block.
This structure is used to implement
bitmaps, and is used by the file system simulator to
track allocated and free data blocks in the file system.
BitBlock extends Block.

Block

is a data structure that views a device block as a
sequence of bytes. The field bytes is an array
of byte, and is directly accessible. Included
are methods to read() and write() the
block to a java.io.RandomAccessFile, which
simulate the action of reading or writing a device block.

FileDescriptor

is a structure and collection of methods that represent
an open file. It includes a number of get and
set methods for various tidbits of information
about the open file, and provides readBlock()
and writeBlock() methods for reading and writing
the blocks of the file.

FileSystem

is a structure and collection of methods that represent
an open (mounted) file system. It includes a few get
and set methods for various fields about the file
system, but more importantly, includes methods to open()
the file behind the file system, to read() and
write() blocks of the device, to manage blocks
(allocateBlock() and freeBlock()) and
to manage inodes (allocateIndexNode()). In general,
Kernel methods should call FileSystem
methods when they want to read or write data in the file system.

IndexNode

is a structure and collection of methods
for representing an index node. This is
meant to reflect the exact structure on disk for an index
node. It includes get and set methods
for each of the fields in the index node. Also included
are read() and write() methods which
are used to copy data to and from byte arrays (not disk files).

ProcessContext

is a structure and collection of methods to represent
a process. This is where the simulator stores the
uid, gid, umask, dir, and other information for the
current process. It includes get and set
methods for each of the fields in a process.

SuperBlock

is a structure and collection of methods for representing
the superblock on the disk. In our implementation, the
superblock contains information about the block size,
number of blocks, offsets to the first block of the free
list, inode block, and data block areas of the device.
It includes get and set methods
for each of the fields in the superblock. Also
included are methods to read() and write()
the superblock.
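
Before changing these classes, it may help to see the two core ideas (a block viewed as bits, and a block read from or written to a RandomAccessFile at blockNumber * blockSize) in isolation. The sketch below is a simplified stand-in, not the BitBlock or Block source: the class name is hypothetical, and the bit ordering (bit 0 is the least significant bit of byte 0) is an assumption, chosen because it reproduces the 1f value seen at address 256 in the sample dump when the first five blocks are allocated.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical illustration; a simplified stand-in for Block and BitBlock.
class BitmapBlockSketch {
    final byte[] bytes;

    BitmapBlockSketch(int blockSize) {
        bytes = new byte[blockSize];
    }

    void setBit(int n)      { bytes[n / 8] |= (byte) (1 << (n % 8)); }
    void resetBit(int n)    { bytes[n / 8] &= (byte) ~(1 << (n % 8)); }
    boolean isBitSet(int n) { return (bytes[n / 8] & (1 << (n % 8))) != 0; }

    /** Writes this block at the position of the given block number. */
    void write(RandomAccessFile file, int blockNumber) throws IOException {
        file.seek((long) blockNumber * bytes.length);
        file.write(bytes);
    }

    /** Reads this block from the position of the given block number. */
    void read(RandomAccessFile file, int blockNumber) throws IOException {
        file.seek((long) blockNumber * bytes.length);
        file.readFully(bytes);
    }

    /** Marks blocks 0..4 allocated, stores the bitmap as block 1, and reads it back. */
    static int demo() {
        try {
            File backing = File.createTempFile("bitmap-sketch", ".dat");
            backing.deleteOnExit();
            BitmapBlockSketch freeList = new BitmapBlockSketch(256);
            for (int block = 0; block < 5; block++) {
                freeList.setBit(block);
            }
            BitmapBlockSketch reloaded = new BitmapBlockSketch(256);
            try (RandomAccessFile file = new RandomAccessFile(backing, "rw")) {
                freeList.write(file, 1);
                reloaded.read(file, 1);
            }
            return reloaded.bytes[0] & 0xff;   // first byte of the stored bitmap
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("first bitmap byte (hex): " + Integer.toHexString(demo()));
    }
}
```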

Of course, you should look at the code and plan your enhancements
carefully.

Suggested Exercises

Use mkfs to create a file system with a block size of 64 bytes
and having a total of 8 blocks. How many index nodes will fit in a block?
How many directory entries will fit in a block? Use dump to examine
the file system backing file, and note the value in byte 64. What does this
value represent? Use mkdir to create a directory (e.g., /usr),
and then use dump to examine byte 64 again. What do you notice? Repeat
the process of creating a directory (e.g., /bin, /lib, /var,
/etc, /home, /mnt, etc.) and examining with dump.
How many directories can you create before you fill up the file system? Explain
why.

filesys/FileSystem.class
public synchronized class FileSystem {
private java.io.RandomAccessFile file;
private String filename;
private String mode;
private short blockSize;
private int blockCount;
private int freeListBlockOffset;
private int inodeBlockOffset;
private int dataBlockOffset;
private IndexNode rootIndexNode;
public static short ROOT_INDEX_NODE_NUMBER;
public static int NOT_A_BLOCK;
private int currentFreeListBitNumber;
private int currentFreeListBlock;
private BitBlock freeListBitBlock;
private short currentIndexNodeNumber;
private short currentIndexNodeBlock;
private byte[] indexBlockBytes;
public void FileSystem(String, String) throws java.io.IOException;
public short getBlockSize();
public int getFreeListBlockOffset();
public int getInodeBlockOffset();
public int getDataBlockOffset();
public IndexNode getRootIndexNode();
public void open() throws java.io.IOException;
public void close() throws java.io.IOException;
public void read(byte[], int) throws java.io.IOException;
public void write(byte[], int) throws java.io.IOException;
public void freeBlock(int) throws java.io.IOException;
public int allocateBlock() throws java.io.IOException;
private void loadFreeListBlock(int) throws java.io.IOException;
public short allocateIndexNode() throws java.io.IOException;
public void readIndexNode(IndexNode, short) throws java.io.IOException;
public void writeIndexNode(IndexNode, short) throws java.io.IOException;
private void loadIndexNodeBlock(short) throws java.io.IOException;
static void ();
}

filesys/FileSystem.java

/**
 * A simulated file system.
 */

import java.io.RandomAccessFile;
import java.io.IOException;

public class FileSystem
{
  private RandomAccessFile file = null;
  private String filename = null;
  private String mode = null;

  private short blockSize = 0;
  private int blockCount = 0;
  private int freeListBlockOffset = 0;
  private int inodeBlockOffset = 0;
  private int dataBlockOffset = 0;

  private IndexNode rootIndexNode = null;

  public static short ROOT_INDEX_NODE_NUMBER = 0;

  public static int NOT_A_BLOCK = 0x00FFFFFF;

  /**
   * Construct a FileSystem and open a FileSystem file.
   * @param newFilename the name of the FileSystem file to open
   * @param newMode the mode ("r" or "rw") to use when opening the file
   * @exception java.io.IOException if any IOExceptions are thrown
   * during the open.
   */
  public FileSystem(String newFilename, String newMode) throws IOException
  {
    super();
    filename = newFilename;
    mode = newMode;
    open();
  }

  /**
   * Get the blockSize for this FileSystem.
   * @return the block size in bytes
   */
  public short getBlockSize()
  {
    return blockSize;
  }

  public int getFreeListBlockOffset()
  {
    return freeListBlockOffset;
  }

  public int getInodeBlockOffset()
  {
    return inodeBlockOffset;
  }

  public int getDataBlockOffset()
  {
    return dataBlockOffset;
  }

  /**
   * Get the rootIndexNode for this FileSystem.
   * @return the root index node
   */
  public IndexNode getRootIndexNode()
  {
    return rootIndexNode;
  }

  /**
   * Open a backing file for this FileSystem and read the superblock.
   * @exception java.io.IOException if the open or read causes
   * IOException to be thrown
   */
  public void open() throws IOException
  {
    file = new RandomAccessFile(filename, mode);

    // read the block size and other information from the superblock
    SuperBlock superBlock = new SuperBlock();
    superBlock.read(file);
    blockSize = superBlock.getBlockSize();
    blockCount = superBlock.getBlocks();
    // ??? inodeCount
    freeListBlockOffset = superBlock.getFreeListBlockOffset();
    inodeBlockOffset = superBlock.getInodeBlockOffset();
    dataBlockOffset = superBlock.getDataBlockOffset();

    // initialize free list block buffer
    freeListBitBlock = new BitBlock(blockSize);

    // initialize index block buffer
    indexBlockBytes = new byte[blockSize];

    // read the root index node
    rootIndexNode = new IndexNode();
    readIndexNode(rootIndexNode, ROOT_INDEX_NODE_NUMBER);
  }

  /**
   * Close the backing file for this FileSystem, if any.
   * @exception java.io.IOException if closing the backing
   * file causes any IOException to be thrown
   */
  public void close() throws IOException
  {
    if (file != null)
      file.close();
  }

  /**
   * Read bytes into a buffer from the specified absolute block number
   * of the file system.
   * @param bytes the byte buffer into which the block should be read
   * @param blockNumber the absolute block number which should be read
   * @exception java.io.IOException if there are any exceptions during
   * the read from the underlying "file system" file.
   */
  public void read(byte[] bytes, int blockNumber) throws IOException
  {
    file.seek(blockNumber * blockSize);
    file.readFully(bytes);
  }

  /**
   * Write bytes from a buffer to the specified absolute block number
   * of the file system.
   * @param bytes the byte buffer from which the block should be written
   * @param blockNumber the absolute block number which should be written
   * @exception java.io.IOException if there are any exceptions during
   * the write to the underlying "file system" file.
   */
  public void write(byte[] bytes, int blockNumber) throws IOException
  {
    file.seek(blockNumber * blockSize);
    file.write(bytes);
  }

  private int currentFreeListBitNumber = 0;
  private int currentFreeListBlock = -1;
  private BitBlock freeListBitBlock = null;

  /**
   * Mark a data block as being free in the free list.
   * @param dataBlockNumber the data block which is to be marked free
   * @exception java.io.IOException if any exception occurs during an
   * operation on the underlying "file system" file.
   */
  public void freeBlock(int dataBlockNumber) throws IOException
  {
    loadFreeListBlock(dataBlockNumber);
    freeListBitBlock.resetBit(dataBlockNumber % (blockSize * 8));
    file.seek((freeListBlockOffset + currentFreeListBlock) * blockSize);
    freeListBitBlock.write(file);
  }

  /**
   * Allocate a data block from the list of free blocks.
   * @return the data block number which was allocated; -1 if no blocks
   * are available
   * @exception java.io.IOException if any exception occurs during an
   * operation on the underlying "file system" file.
   */
  public int allocateBlock() throws IOException
  {
    // from our current position in the free list block,
    // scan until we find an open position. If we get back to
    // where we started, there are no free blocks and we return
    // -1.
    int save = currentFreeListBitNumber;
    while (true)
    {
      loadFreeListBlock(currentFreeListBitNumber);
      boolean allocated = freeListBitBlock.isBitSet(
        currentFreeListBitNumber % (blockSize * 8));
      int previousFreeListBitNumber = currentFreeListBitNumber;
      currentFreeListBitNumber++;
      // if curr bit number >= data block count, set to 0
      if (currentFreeListBitNumber >= (blockCount - dataBlockOffset))
        currentFreeListBitNumber = 0;
      if (!allocated)
      {
        freeListBitBlock.setBit(previousFreeListBitNumber % (blockSize * 8));
        file.seek((freeListBlockOffset + currentFreeListBlock) * blockSize);
        freeListBitBlock.write(file);
        return previousFreeListBitNumber;
      }
      if (save == currentFreeListBitNumber)
      {
        Kernel.setErrno(Kernel.ENOSPC);
        return -1;
      }
    }
  }

  /**
   * Loads the block containing the specified data block bit into
   * the free list block buffer. This is a convenience method.
   * @param dataBlockNumber the data block number
   * @exception java.io.IOException
   */
  private void loadFreeListBlock(int dataBlockNumber) throws IOException
  {
    int neededFreeListBlock = dataBlockNumber / (blockSize * 8);
    if (currentFreeListBlock != neededFreeListBlock)
    {
      file.seek((freeListBlockOffset + neededFreeListBlock) * blockSize);
      freeListBitBlock.read(file);
      currentFreeListBlock = neededFreeListBlock;
    }
  }

  /**
   * The index node number that will next be checked to see
   * if it is available.
   */
  private short currentIndexNodeNumber = 0;

  /**
   * The number of the index node block which is currently
   * loaded into indexBlockBytes. If no block is loaded,
   * this contains the value "-1".
   */
  private short currentIndexNodeBlock = -1;

  /**
   * The byte buffer used for reading and writing
   * index node blocks. You can think of this as
   * a one-block cache.
   */
  private byte[] indexBlockBytes = null;

  /**
   * Allocate an index node for the file system.
   * @return the inode number for the next available index node;
   * -1 if there are no index nodes available.
   * @exception java.io.IOException if there is an exception during
   * an operation on the underlying "file system" file.
   */
  public short allocateIndexNode() throws IOException
  {
    // from our current position in the index node block list,
    // scan until we find an open position. If we get back to
    // where we started, there are no free inodes and we return
    // -1.
    short save = currentIndexNodeNumber;
    IndexNode temp = new IndexNode();
    while (true)
    {
      readIndexNode(temp, currentIndexNodeNumber);
      short previousIndexNodeNumber = currentIndexNodeNumber;
      currentIndexNodeNumber++;
      // if curr inode >= avail inode space, set to 0
      if (currentIndexNodeNumber >=
          ((dataBlockOffset - inodeBlockOffset) *
           (blockSize / IndexNode.INDEX_NODE_SIZE)))
        currentIndexNodeNumber = 0;
      if (temp.getNlink() == 0)
      {
        // ??? should we update nlinks here?
        return previousIndexNodeNumber;
      }
      if (save == currentIndexNodeNumber)
      {
        // ??? it seems like we should give a different error here
        Kernel.setErrno(Kernel.ENOSPC);
        return -1;
      }
    }
  }

  /**
   * Reads an index node at the index node location specified.
   * @param indexNode the index node
   * @param indexNodeNumber the location
   * @exception java.io.IOException if any exception occurs in an
   * underlying operation on the "file system" file.
   */
  public void readIndexNode(IndexNode indexNode, short indexNodeNumber)
    throws IOException
  {
    loadIndexNodeBlock(indexNodeNumber);
    indexNode.read(indexBlockBytes,
      (indexNodeNumber * IndexNode.INDEX_NODE_SIZE) % blockSize);
  }

  /**
   * Writes an index node at the index node location specified.
   * @param indexNode the index node
   * @param indexNodeNumber the location
   * @exception java.io.IOException if any exception occurs in an
   * underlying operation on the "file system" file.
   */
  public void writeIndexNode(IndexNode indexNode, short indexNodeNumber)
    throws IOException
  {
    loadIndexNodeBlock(indexNodeNumber);
    indexNode.write(indexBlockBytes,
      (indexNodeNumber * IndexNode.INDEX_NODE_SIZE) % blockSize);
    write(indexBlockBytes, inodeBlockOffset + currentIndexNodeBlock);
  }

  /**
   * Loads the block containing the specified index node into
   * the index node block buffer. This is a convenience method.
   * @param indexNodeNumber the index node number
   * @exception java.io.IOException
   */
  private void loadIndexNodeBlock(short indexNodeNumber) throws IOException
  {
    short neededIndexNodeBlock = (short)(indexNodeNumber /
      (blockSize / IndexNode.INDEX_NODE_SIZE));
    if (currentIndexNodeBlock != neededIndexNodeBlock)
    {
      read(indexBlockBytes, inodeBlockOffset + neededIndexNodeBlock);
      currentIndexNodeBlock = neededIndexNodeBlock;
    }
  }

}


filesys/IndexNode.class
public synchronized class IndexNode {
public static final int INDEX_NODE_SIZE = 64;
public static final int MAX_DIRECT_BLOCKS = 10;
public static final int MAX_FILE_BLOCKS = 10;
private short mode;
private short nlink;
private short uid;
private short gid;
private int size;
private int[] directBlocks;
private int indirectBlock;
private int doubleIndirectBlock;
private int tripleIndirectBlock;
private int atime;
private int mtime;
private int ctime;
public void IndexNode();
public void setMode(short);
public short getMode();
public void setNlink(short);
public short getNlink();
public void setUid(short);
public short getUid();
public short getGid();
public void setGid(short);
public void setSize(int);
public int getSize();
public int getBlockAddress(int) throws Exception;
public void setBlockAddress(int, int) throws Exception;
public void setAtime(int);
public int getAtime();
public void setMtime(int);
public int getMtime();
public void setCtime(int);
public int getCtime();
public void write(byte[], int);
public void read(byte[], int);
public String toString();
public void copy(IndexNode);
public static void main(String[]) throws Exception;
}

filesys/IndexNode.java

/**
 * An index node for a simulated file system.
 */
public class IndexNode
{
  /**
   * Size of each index node in bytes.
   */
  public static final int INDEX_NODE_SIZE = 64;

  /**
   * Maximum number of direct blocks in an index node.
   */
  public static final int MAX_DIRECT_BLOCKS = 10;

  /**
   * Maximum number of blocks in a file. If indirect,
   * doubleIndirect, or tripleIndirect blocks are implemented,
   * this number will need to be increased.
   */
  public static final int MAX_FILE_BLOCKS = MAX_DIRECT_BLOCKS;

  /**
   * Mode for this index node. This includes file type and file protection
   * information.
   */
  private short mode = 0;

  /**
   * Not yet implemented.
   * Number of links to this file.
   */
  private short nlink = 0;

  /**
   * Not yet implemented.
   * Owner's user id.
   */
  private short uid = 0;

  /**
   * Not yet implemented.
   * Owner's group id.
   */
  private short gid = 0;

  /**
   * Number of bytes in this file.
   */
  private int size = 0;

  /**
   * Array of direct blocks containing the block addresses for the
   * first MAX_DIRECT_BLOCKS blocks of the file. Note that each
   * element in the array is stored as a 3-byte number on disk.
   * The array must hold MAX_DIRECT_BLOCKS entries, because read(),
   * write(), and the block address accessors index it up to that limit.
   */
  private int directBlocks[] =
  {
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK,
    FileSystem.NOT_A_BLOCK
  };

  /**
   * Not yet implemented.
   */
  private int indirectBlock = FileSystem.NOT_A_BLOCK;

  /**
   * Not yet implemented.
   */
  private int doubleIndirectBlock = FileSystem.NOT_A_BLOCK;

  /**
   * Not yet implemented.
   */
  private int tripleIndirectBlock = FileSystem.NOT_A_BLOCK;

  /**
   * Not yet implemented.
   * The date and time at which this file was last accessed.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int atime = 0;

  /**
   * Not yet implemented.
   * The date and time at which this file was last modified.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int mtime = 0;

  /**
   * Not yet implemented.
   * The date and time at which this file was created.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int ctime = 0;

  /**
   * Creates an index node.
   */
  public IndexNode()
  {
    super();
  }

  /**
   * Sets the mode for this IndexNode.
   * This is the file type and file protection information.
   */
  public void setMode(short newMode)
  {
    mode = newMode;
  }

  /**
   * Gets the mode for this IndexNode.
   * This is the file type and file protection information.
   */
  public short getMode()
  {
    return mode;
  }

  /**
   * Set the number of links for this IndexNode.
   * @param newNlink the number of links
   */
  public void setNlink(short newNlink)
  {
    nlink = newNlink;
  }

  /**
   * Get the number of links for this IndexNode.
   * @return the number of links
   */
  public short getNlink()
  {
    return nlink;
  }

  public void setUid(short newUid)
  {
    uid = newUid;
  }

  public short getUid()
  {
    return uid;
  }

  public short getGid()
  {
    return gid;
  }

  public void setGid(short newGid)
  {
    gid = newGid;
  }

  /**
   * Sets the size for this IndexNode.
   * This is the number of bytes in the file.
   */
  public void setSize(int newSize)
  {
    size = newSize;
  }

  /**
   * Gets the size for this IndexNode.
   * This is the number of bytes in the file.
   */
  public int getSize()
  {
    return size;
  }

  /**
   * Gets the address corresponding to the specified
   * sequential block of the file.
   * @param block the sequential block number
   * @return the address of the block, a number between zero and one
   * less than the number of blocks in the file system
   * @exception java.lang.Exception if the block number is invalid
   */
  public int getBlockAddress(int block) throws Exception
  {
    if (block >= 0 && block < MAX_DIRECT_BLOCKS)
      return (directBlocks[block]);
    else
      throw new Exception("invalid block address " + block);
  }

  /**
   * Sets the address corresponding to the specified sequential
   * block of the file.
   * @param block the sequential block number
   * @param address the address of the block, a number between zero and one
   * less than the number of blocks in the file system
   * @exception java.lang.Exception if the block number is invalid
   */
  public void setBlockAddress(int block, int address) throws Exception
  {
    if (block >= 0 && block < MAX_DIRECT_BLOCKS)
      directBlocks[block] = address;
    else
      throw new Exception("invalid block address " + block);
  }

  public void setAtime(int newAtime)
  {
    atime = newAtime;
  }

  public int getAtime()
  {
    return atime;
  }

  public void setMtime(int newMtime)
  {
    mtime = newMtime;
  }

  public int getMtime()
  {
    return mtime;
  }

  public void setCtime(int newCtime)
  {
    ctime = newCtime;
  }

  public int getCtime()
  {
    return ctime;
  }

  /**
   * Writes the contents of an index node to a byte array.
   * This is used to copy the bytes which correspond to the
   * disk image of the index node onto a block buffer so that
   * they may be written to the file system.
   * @param buffer the buffer to which bytes should be written
   * @param offset the offset from the beginning of the buffer
   * at which bytes should be written
   */
  public void write(byte[] buffer, int offset)
  {
    // write the mode info
    buffer[offset] = (byte)(mode >>> 8);
    buffer[offset + 1] = (byte)mode;

    // write nlink
    buffer[offset + 2] = (byte)(nlink >>> 8);
    buffer[offset + 3] = (byte)nlink;

    // write uid
    buffer[offset + 4] = (byte)(uid >>> 8);
    buffer[offset + 5] = (byte)uid;

    // write gid
    buffer[offset + 6] = (byte)(gid >>> 8);
    buffer[offset + 7] = (byte)gid;

    // write the size info
    buffer[offset + 8] = (byte)(size >>> 24);
    buffer[offset + 8 + 1] = (byte)(size >>> 16);
    buffer[offset + 8 + 2] = (byte)(size >>> 8);
    buffer[offset + 8 + 3] = (byte)(size);

    // write the directBlocks info 3 bytes at a time
    for (int i = 0; i < MAX_DIRECT_BLOCKS; i++)
    {
      buffer[offset + 12 + 3 * i] = (byte)(directBlocks[i] >>> 16);
      buffer[offset + 12 + 3 * i + 1] = (byte)(directBlocks[i] >>> 8);
      buffer[offset + 12 + 3 * i + 2] = (byte)(directBlocks[i]);
    }

    // leave room for indirectBlock, doubleIndirectBlock, tripleIndirectBlock

    // leave room for atime, mtime, ctime
  }

  /**
   * Reads the contents of an index node from a byte array.
   * This is used to copy the bytes which correspond to the
   * disk image of the index node from a block buffer that
   * has been read from the file system.
   * @param buffer the buffer from which bytes should be read
   * @param offset the offset from the beginning of the buffer
   * at which bytes should be read
   */
  public void read( byte[] buffer , int offset )
  {
    int b3 ;
    int b2 ;
    int b1 ;
    int b0 ;

    // read the mode info
    b1 = buffer[offset] & 0xff ;
    b0 = buffer[offset+1] & 0xff ;
    mode = (short)( b1 << 8 | b0 ) ;

    // read the nlink info
    b1 = buffer[offset+2] & 0xff ;
    b0 = buffer[offset+3] & 0xff ;
    nlink = (short)( b1 << 8 | b0 ) ;

    // read the uid info
    b1 = buffer[offset+4] & 0xff ;
    b0 = buffer[offset+5] & 0xff ;
    uid = (short)( b1 << 8 | b0 ) ;

    // read the gid info
    b1 = buffer[offset+6] & 0xff ;
    b0 = buffer[offset+7] & 0xff ;
    gid = (short)( b1 << 8 | b0 ) ;

    // read the size info
    b3 = buffer[offset+8] & 0xff ;
    b2 = buffer[offset+8+1] & 0xff ;
    b1 = buffer[offset+8+2] & 0xff ;
    b0 = buffer[offset+8+3] & 0xff ;
    size = b3 << 24 | b2 << 16 | b1 << 8 | b0 ;

    // read the block address info 3 bytes at a time
    for( int i = 0 ; i < MAX_DIRECT_BLOCKS ; i ++ )
    {
      b2 = buffer[offset+12+i*3] & 0xff ;
      b1 = buffer[offset+12+i*3+1] & 0xff ;
      b0 = buffer[offset+12+i*3+2] & 0xff ;
      directBlocks[i] = b2 << 16 | b1 << 8 | b0 ;
    }

    // leave room for indirectBlock, doubleIndirectBlock, tripleIndirectBlock

    // leave room for atime, mtime, ctime
  }

  /**
   * Converts an index node into a printable string.
   * @return the printable string
   */
  public String toString()
  {
    StringBuffer s = new StringBuffer( "IndexNode[" ) ;
    s.append( mode ) ;
    s.append( ',' ) ;
    s.append( '{' ) ;
    for( int i = 0 ; i < MAX_DIRECT_BLOCKS ; i ++ )
    {
      if( i > 0 )
        s.append( ',' ) ;
      s.append( directBlocks[i] ) ;
    }
    s.append( '}' ) ;
    s.append( ']' ) ;
    return s.toString() ;
  }

  public void copy( IndexNode indexNode )
  {
    indexNode.mode = mode ;
    indexNode.nlink = nlink ;
    indexNode.uid = uid ;
    indexNode.gid = gid ;
    indexNode.size = size ;
    for( int i = 0 ; i < MAX_DIRECT_BLOCKS ; i ++ )
      indexNode.directBlocks[i] = directBlocks[i] ;
    indexNode.indirectBlock = indirectBlock ;
    indexNode.doubleIndirectBlock = doubleIndirectBlock ;
    indexNode.tripleIndirectBlock = tripleIndirectBlock ;
    indexNode.atime = atime ;
    indexNode.mtime = mtime ;
    indexNode.ctime = ctime ;
  }

  /**
   * A test driver for IndexNode.
   * @exception java.lang.Exception any exception which may occur
   */
  public static void main( String[] args ) throws Exception
  {
    byte[] buffer = new byte[512] ;
    IndexNode root = new IndexNode() ;
    root.setMode( Kernel.S_IFDIR ) ;
    root.setBlockAddress( 0 , 33 ) ;
    System.out.println( root.toString() ) ;
    IndexNode copy = new IndexNode() ;
    root.write( buffer , 0 ) ;
    copy.read( buffer , 0 ) ;
    System.out.println( copy.toString() ) ;
  }

}

filesys/Kernel.class

public synchronized class Kernel {
  public static final String PROGRAM_NAME = Kernel;
  public static final int EPERM = 1;
  public static final int ENOENT = 2;
  public static final int EBADF = 9;
  public static final int EACCES = 13;
  public static final int EEXIST = 17;
  public static final int EXDEV = 18;
  public static final int ENOTDIR = 20;
  public static final int EISDIR = 21;
  public static final int EINVAL = 22;
  public static final int ENFILE = 23;
  public static final int EMFILE = 24;
  public static final int EFBIG = 27;
  public static final int ENOSPC = 28;
  public static final int EROFS = 30;
  public static final int EMLINK = 31;
  public static final int sys_nerr = 32;
  public static final String[] sys_errlist;
  public static final short S_IFMT = -4096;
  public static final short S_IFREG = -32768;
  public static final short S_IFMPB = 28672;
  public static final short S_IFBLK = 24576;
  public static final short S_IFDIR = 16384;
  public static final short S_IFMPC = 12288;
  public static final short S_IFCHR = 8192;
  public static final short S_ISUID = 2048;
  public static final short S_ISGID = 1024;
  public static final short S_ISVTX = 512;
  public static final short S_IRWXU = 448;
  public static final short S_IRUSR = 256;
  public static final short S_IREAD = 256;
  public static final short S_IWUSR = 128;
  public static final short S_IWRITE = 128;
  public static final short S_IXUSR = 64;
  public static final short S_IEXEC = 64;
  public static final short S_IRWXG = 56;
  public static final short S_IRGRP = 32;
  public static final short S_IWGRP = 16;
  public static final short S_IXGRP = 8;
  public static final short S_IRWXO = 7;
  public static final short S_IROTH = 4;
  public static final short S_IWOTH = 2;
  public static final short S_IXOTH = 1;
  public static final int O_RDONLY = 0;
  public static final int O_WRONLY = 1;
  public static final int O_RDWR = 2;
  private static ProcessContext process;
  private static int processCount;
  private static int MAX_OPEN_FILES;
  private static FileDescriptor[] openFiles;
  public static int MAX_OPEN_FILE_SYSTEMS;
  public static FileSystem[] openFileSystems;
  public static short ROOT_FILE_SYSTEM;
  private static int EXIT_FAILURE;
  private static int EXIT_SUCCESS;
  private static IndexNode rootIndexNode;
  public void Kernel();
  public static void perror(String);
  public static void setErrno(int);
  public static int getErrno();
  public static int close(int);
  public static int creat(String, short) throws Exception;
  public static void exit(int) throws Exception;
  public static int lseek(int, int, int);
  public static int open(String, int) throws Exception;
  private static int open(FileDescriptor);
  public static int read(int, byte[], int) throws Exception;
  public static int readdir(int, DirectoryEntry) throws Exception;
  public static int fstat(int, Stat) throws Exception;
  public static int stat(String, Stat) throws Exception;
  public static void sync();
  public static int write(int, byte[], int) throws Exception;
  public static int writedir(int, DirectoryEntry) throws Exception;
  public static void initialize();
  public static void finalize(int) throws Exception;
  private static int check_fd(int);
  private static int check_fd_for_read(int);
  private static int check_fd_for_write(int);
  private static String getFullPath(String);
  private static IndexNode getRootIndexNode();
  private static short findNextIndexNode(FileSystem, IndexNode, String, IndexNode) throws Exception;
  private static short findIndexNode(String, IndexNode) throws Exception;
  static void ();
}
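
IndexNode.write and IndexNode.read store each multi-byte field high byte first and mask every byte with 0xff when reading it back. The sketch below isolates that technique for a 16-bit and a 24-bit field so the round trip can be tried on its own. The class name BigEndianFields and the helpers put16, get16, put24 and get24 are hypothetical names for illustration and are not part of the filesys project.

```java
public class BigEndianFields {
    /** Stores the low 16 bits of value at offset, high byte first. */
    public static void put16(byte[] buffer, int offset, int value) {
        buffer[offset] = (byte) (value >>> 8);
        buffer[offset + 1] = (byte) value;
    }

    /** Reads two bytes back as an unsigned 16-bit value. */
    public static int get16(byte[] buffer, int offset) {
        // Java bytes are signed; the mask removes the sign extension.
        int b1 = buffer[offset] & 0xff;
        int b0 = buffer[offset + 1] & 0xff;
        return b1 << 8 | b0;
    }

    /** Stores the low 24 bits of value at offset, high byte first. */
    public static void put24(byte[] buffer, int offset, int value) {
        buffer[offset] = (byte) (value >>> 16);
        buffer[offset + 1] = (byte) (value >>> 8);
        buffer[offset + 2] = (byte) value;
    }

    /** Reads three bytes back as an unsigned 24-bit value. */
    public static int get24(byte[] buffer, int offset) {
        int b2 = buffer[offset] & 0xff;
        int b1 = buffer[offset + 1] & 0xff;
        int b0 = buffer[offset + 2] & 0xff;
        return b2 << 16 | b1 << 8 | b0;
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[8];
        put16(buffer, 0, 0x8001);
        put24(buffer, 2, 33);
        System.out.println(get16(buffer, 0)); // 32769
        System.out.println(get24(buffer, 2)); // 33
        // Without the mask, the high byte 0x80 is read as -128.
        System.out.println(buffer[0] << 8 | (buffer[1] & 0xff)); // -32767
    }
}
```

Three bytes per block address cap an address at 2^24 - 1, which is presumably why the direct block entries are packed that way; the source does not state the reason.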

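
The class listing above shows S_IFMT = -4096 and S_IFREG = -32768. These are the octal constants 0170000 and 0100000 narrowed to a 16-bit short, which is why they print as negative numbers. The sketch below, under the assumption that a mode is tested the way creat tests it, `( mode & S_IFMT ) == S_IFDIR`, shows that the comparison still works with the negative values. The class name ModeBits and the helpers isDirectory and isRegularFile are hypothetical and not part of the filesys project.

```java
public class ModeBits {
    // Same octal values as the Kernel constants, narrowed to short.
    public static final short S_IFMT = (short) 0170000;
    public static final short S_IFREG = (short) 0100000;
    public static final short S_IFDIR = 040000;

    /** True if the file type bits of mode say "directory". */
    public static boolean isDirectory(short mode) {
        return (mode & S_IFMT) == S_IFDIR;
    }

    /** True if the file type bits of mode say "regular file". */
    public static boolean isRegularFile(short mode) {
        // Both operands are sign-extended to int the same way,
        // so the comparison holds even though S_IFREG is negative.
        return (mode & S_IFMT) == S_IFREG;
    }

    public static void main(String[] args) {
        short dir = (short) (S_IFDIR | 0755);
        short file = (short) (S_IFREG | 0644);
        System.out.println(S_IFMT);              // -4096
        System.out.println(S_IFREG);             // -32768
        System.out.println(isDirectory(dir));    // true
        System.out.println(isRegularFile(file)); // true
        System.out.println(isDirectory(file));   // false
    }
}
```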
filesys/Kernel.java

/*
 * $Id: Kernel.java
 * 456789012345678901234567890123456789012345678901234567890123456789012
 */

import java.util.StringTokenizer ;
import java.util.Properties ;
import java.io.FileInputStream ;
import java.io.IOException ;
import java.io.FileNotFoundException ;

/**
 * Simulates a unix-like file system.  Provides basic directory
 * and file operations and implements them in terms of the underlying
 * disk block structures.
 */
public class Kernel
{

  /**
   * The name this program uses when displaying any error messages
   * it generates internally.
   */
  public static final String PROGRAM_NAME = "Kernel" ;

  /* Errors */

  /**
   * Not owner.
   */
  public static final int EPERM = 1 ;

  /**
   * No such file or directory.
   */
  public static final int ENOENT = 2 ;

  /**
   * Bad file number.
   */
  public static final int EBADF = 9 ;

  /**
   * Permission denied.
   */
  public static final int EACCES = 13 ;

  /**
   * File exists.
   */
  public static final int EEXIST = 17 ;

  /**
   * Cross-device link.
   */
  public static final int EXDEV = 18 ;

  /**
   * Not a directory.
   */
  public static final int ENOTDIR = 20 ;

  /**
   * Is a directory.
   */
  public static final int EISDIR = 21 ;

  /**
   * Invalid argument.
   */
  public static final int EINVAL = 22 ;

  /**
   * File table overflow.
   */
  public static final int ENFILE = 23 ;

  /**
   * Too many open files.
   */
  public static final int EMFILE = 24 ;

  /**
   * File too large.
   */
  public static final int EFBIG = 27 ;

  /**
   * No space left on device.
   */
  public static final int ENOSPC = 28 ;

  /**
   * Read-only file system.
   */
  public static final int EROFS = 30 ;

  /**
   * Too many links.
   */
  public static final int EMLINK = 31 ;

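  /**
   * Number of error messages defined in sys_errlist.
   * Simulates unix system variable:
   *   int sys_nerr;
   */
  public static final int sys_nerr = 32 ;

  /**
   * The array of kernel error messages, indexed by errno value.
   * Entries for error numbers this kernel does not define are null.
   * Simulates unix system variable:
   *   const char *sys_errlist[];
   */
  public static final String[] sys_errlist = {
    null                            // 0
    , "Not owner"                   // 1  EPERM
    , "No such file or directory"   // 2  ENOENT
    , null                          // 3
    , null                          // 4
    , null                          // 5
    , null                          // 6
    , null                          // 7
    , null                          // 8
    , "Bad file number"             // 9  EBADF
    , null                          // 10
    , null                          // 11
    , null                          // 12
    , "Permission denied"           // 13 EACCES
    , null                          // 14
    , null                          // 15
    , null                          // 16
    , "File exists"                 // 17 EEXIST
    , "Cross-device link"           // 18 EXDEV
    , null                          // 19
    , "Not a directory"             // 20 ENOTDIR
    , "Is a directory"              // 21 EISDIR
    , "Invalid argument"            // 22 EINVAL
    , "File table overflow"         // 23 ENFILE
    , "Too many open files"         // 24 EMFILE
    , null                          // 25
    , null                          // 26
    , "File too large"              // 27 EFBIG
    , "No space left on device"     // 28 ENOSPC
    , null                          // 29
    , "Read-only file system"       // 30 EROFS
    , "Too many links"              // 31 EMLINK
  } ;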

  /**
   * Prints a system error message.  The actual text written
   * to stderr is the
   * given string, followed by a colon, a space, the message
   * text, and a newline.  It is customary to give the name of
   * the program as the argument to perror.
   * @param s the program name
   */
  public static void perror( String s )
  {
    String message = null ;
    if ( ( process.errno > 0 ) && ( process.errno < sys_nerr ) )
      message = sys_errlist[process.errno] ;
    if ( message == null )
      System.err.println( s + ": unknown errno " + process.errno ) ;
    else
      System.err.println( s + ": " + message ) ;
  }

  /**
   * Set the value of errno for the current process.
   *
   * Simulates the unix variable:
   *   extern int errno ;
   *
   * @see getErrno
   */
  public static void setErrno( int newErrno )
  {
    if( process == null )
    {
      System.err.println( PROGRAM_NAME + 
        ": no current process in setErrno()" ) ;
      System.exit( EXIT_FAILURE ) ;
    }
    process.errno = newErrno ;
  }

  /**
   * Get the value of errno for the current process.
   * Simulates the unix variable:
   *   extern int errno ;
   *
   * @see setErrno
   */
  public static int getErrno()
  {
    if( process == null )
    {
      System.err.println( PROGRAM_NAME + 
        ": no current process in getErrno()" ) ;
      System.exit( EXIT_FAILURE ) ;
    }
    return process.errno ;
  }

  /* Modes */

  /**
   * File type mask
   */
  public static final short S_IFMT = (short)0170000 ;

  /**
   * Regular file
   */
  public static final short S_IFREG = (short)0100000 ;

  /**
   * Multiplexed block special
   */
  public static final short S_IFMPB = 070000 ;

  /**
   * Block special
   */
  public static final short S_IFBLK = 060000 ;

  /**
   * Directory
   */
  public static final short S_IFDIR = 040000 ;

  /**
   * Multiplexed character special
   */
  public static final short S_IFMPC = 030000 ;

  /**
   * Character special
   */
  public static final short S_IFCHR = 020000 ;

  /**
   * Set user id on execution
   */
  public static final short S_ISUID = 04000 ;

  /**
   * Set group id on execution
   */
  public static final short S_ISGID = 02000 ;

  /**
   * Save swapped text even after use
   */
  public static final short S_ISVTX = 01000 ;

  /**
   * User (file owner) has read, write and execute permission
   */
  public static final short S_IRWXU = 0700 ;

  /**
   * User has read permission
   */
  public static final short S_IRUSR = 0400 ;

  /**
   * User has read permission
   */
  public static final short S_IREAD = 0400 ;

  /**
   * User has write permission
   */
  public static final short S_IWUSR = 0200 ;

  /**
   * User has write permission
   */
  public static final short S_IWRITE = 0200 ;

  /**
   * User has execute permission
   */
  public static final short S_IXUSR = 0100 ;

  /**
   * User has execute permission
   */
  public static final short S_IEXEC = 0100 ;

  /**
   * Group has read, write and execute permission
   */
  public static final short S_IRWXG = 070 ;

  /**
   * Group has read permission
   */
  public static final short S_IRGRP = 040 ;

  /**
   * Group has write permission
   */
  public static final short S_IWGRP = 020 ;

  /**
   * Group has execute permission
   */
  public static final short S_IXGRP = 010 ;

  /**
   * Others have read, write and execute permission
   */
  public static final short S_IRWXO = 07 ;

  /**
   * Others have read permission
   */
  public static final short S_IROTH = 04 ;

  /**
   * Others have write permission
   */
  public static final short S_IWOTH = 02 ;

  /**
   * Others have execute permission
   */
  public static final short S_IXOTH = 01 ;

  /**
   * Closes the specified file descriptor.
   * Simulates the unix system call:
   *   int close(int fd);
   *
   * @param fd the file descriptor of the file to close
   * @return Zero if the file is closed; -1 if the file descriptor
   * is invalid.
   */
  public static int close( int fd )
  {
    // check fd
    int status = check_fd( fd ) ;
    if( status < 0 )
      return status ;

    // remove the file descriptor from the kernel's list of open files
    for( int i = 0 ; i < MAX_OPEN_FILES ; i ++ )
      if( openFiles[i] == process.openFiles[fd] )
      {
        openFiles[i] = null ;
        break ;
      }
    // ??? is it an error if we didn't find the open file?

    // remove the file descriptor from the list.
    process.openFiles[fd] = null ;
    return 0 ;
  }

  /**
   * Creates a file or directory with the specified mode.
   *

   * Creates a new file or prepares to rewrite an existing file.
   * If the file does not exist, it is given the mode specified.
   * If the file does exist, it is truncated to length zero.
   * The file is opened for writing and its file descriptor is
   * returned.
   * Simulates the unix system call:
   *   int creat(const char *pathname, mode_t mode);
   *
   * @param pathname the name of the file or directory to create
   * @param mode the file or directory protection mode for the new file
   * @return the file descriptor (a non-negative integer); -1 if
   * a needed directory is not searchable, if the file does not
   * exist and the directory in which it is to be created is not
   * writable, if the file does exist and is unwritable, if the
   * file is a directory, or if there are already too many open
   * files.
   * @exception java.lang.Exception if any underlying action causes
   * an exception to be thrown
   */
  public static int creat( String pathname , short mode )
    throws Exception
  {
    // get the full path
    String fullPath = getFullPath( pathname ) ;

    StringBuffer dirname = new StringBuffer( "/" ) ;
    FileSystem fileSystem = openFileSystems[ROOT_FILE_SYSTEM] ;
    IndexNode currIndexNode = getRootIndexNode() ;
    IndexNode prevIndexNode = null ;
    short indexNodeNumber = FileSystem.ROOT_INDEX_NODE_NUMBER ;

    StringTokenizer st = new StringTokenizer( fullPath , "/" ) ;
    String name = "." ;
    // start at root node
    while( st.hasMoreTokens() )
    {
      name = st.nextToken() ;
      if ( ! name.equals( "" ) )
      {
        // check to see if the current node is a directory
        if( ( currIndexNode.getMode() & S_IFMT ) != S_IFDIR )
        {
          // return (ENOTDIR) if a needed directory is not a directory
          process.errno = ENOTDIR ;
          return -1 ;
        }

        // check to see if it is readable by the user
        // ??? tbd
        // return (EACCES) if a needed directory is not readable

        if( st.hasMoreTokens() )
        {
          dirname.append( name ) ;
          dirname.append( '/' ) ;
        }

        // get the next inode corresponding to the token
        prevIndexNode = currIndexNode ;
        currIndexNode = new IndexNode() ;
        indexNodeNumber = findNextIndexNode(
          fileSystem , prevIndexNode , name , currIndexNode ) ;
      }
    }

    // ??? we need to set some fields in the file descriptor
    int flags = O_WRONLY ; // ???
    FileDescriptor fileDescriptor = null ;

    if ( indexNodeNumber < 0 )
    {
      // file does not exist.  We check to see if we can create it.

      // check to see if the prevIndexNode (a directory) is writeable
      // ??? tbd
      // return (EACCES) if the file does not exist and the directory
      // in which it is to be created is not writable

      currIndexNode.setMode( mode ) ;
      currIndexNode.setNlink( (short)1 ) ;

      // allocate the next available inode from the file system
      short newInode = fileSystem.allocateIndexNode() ;
      if( newInode == -1 )
        return -1 ;

      fileDescriptor =
        new FileDescriptor( fileSystem , currIndexNode , flags ) ;
      // assign inode for the new file
      fileDescriptor.setIndexNodeNumber( newInode ) ;

      // System.out.println( "newInode = " + newInode ) ;
      fileSystem.writeIndexNode( currIndexNode , newInode ) ;

      // open the directory
      // ??? it would be nice if we had an "open" that took an inode
      // instead of a name for the dir
      // System.out.println( "dirname = " + dirname.toString() ) ;
      int dir = open( dirname.toString() , O_RDWR ) ;
      if( dir < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        System.err.println( PROGRAM_NAME +
          ": unable to open directory for writing" ) ;
        Kernel.exit( 1 ) ; // ??? is this correct
      }

      // scan past the directory entries less than the current entry
      // and insert the new element immediately following
      int status ;
      DirectoryEntry newDirectoryEntry =
        new DirectoryEntry( newInode , name ) ;
      DirectoryEntry currentDirectoryEntry = new DirectoryEntry() ;
      while( true )
      {
        // read an entry from the directory
        status = readdir( dir , currentDirectoryEntry ) ;
        if( status < 0 )
        {
          System.err.println( PROGRAM_NAME +
            ": error reading directory in creat" ) ;
          System.exit( EXIT_FAILURE ) ;
        }
        else if( status == 0 )
        {
          // if no entry read, write the new item at the current
          // location and break
          writedir( dir , newDirectoryEntry ) ;
          break ;
        }
        else
        {
          // if current item > new item, write the new item in
          // place of the old one and break
          if( currentDirectoryEntry.getName().compareTo(
            newDirectoryEntry.getName() ) > 0 )
          {
            int seek_status =
              lseek( dir , - DirectoryEntry.DIRECTORY_ENTRY_SIZE , 1 ) ;
            if( seek_status < 0 )
            {
              System.err.println( PROGRAM_NAME +
                ": error during seek in creat" ) ;
              System.exit( EXIT_FAILURE ) ;
            }
            writedir( dir , newDirectoryEntry ) ;
            break ;
          }
        }
      }

      // copy the rest of the directory entries out to the file
      while( status > 0 )
      {
        DirectoryEntry nextDirectoryEntry = new DirectoryEntry() ;
        // read next item
        status = readdir( dir , nextDirectoryEntry ) ;
        if( status > 0 )
        {
          // seek back one entry so the current item is written
          // in its place
          int seek_status =
            lseek( dir , - DirectoryEntry.DIRECTORY_ENTRY_SIZE , 1 ) ;
          if( seek_status < 0 )
          {
            System.err.println( PROGRAM_NAME +
              ": error during seek in creat" ) ;
            System.exit( EXIT_FAILURE ) ;
          }
        }
        // write current item
        writedir( dir , currentDirectoryEntry ) ;
        // current item = next item
        currentDirectoryEntry = nextDirectoryEntry ;
      }

      // close the directory
      close( dir ) ;
    }
    else
    {
      // file does exist ( indexNodeNumber >= 0 )

      // if it's a directory, we can't truncate it
      if( ( currIndexNode.getMode() & S_IFMT ) == S_IFDIR )
      {
        // return (EISDIR) if the file is a directory
        process.errno = EISDIR ;
        return -1 ;
      }

      // check to see if the file is writeable by the user
      // ??? tbd
      // return (EACCES) if the file does exist and is unwritable

      // free any blocks currently allocated to the file
      int blockSize = fileSystem.getBlockSize() ;
      int blocks = ( currIndexNode.getSize() + blockSize - 1 ) /
        blockSize ;
      for( int i = 0 ; i < blocks ; i ++ )
      {
        int address = currIndexNode.getBlockAddress( i ) ;
        if( address != FileSystem.NOT_A_BLOCK )
        {
          fileSystem.freeBlock( address ) ;
          currIndexNode.setBlockAddress( i , FileSystem.NOT_A_BLOCK ) ;
        }
      }

      // update the inode to size 0
      currIndexNode.setSize( 0 ) ;

      // write the inode to the file system.
      fileSystem.writeIndexNode( currIndexNode , indexNodeNumber ) ;

      // set up the file descriptor
      fileDescriptor =
        new FileDescriptor( fileSystem , currIndexNode , flags ) ;
      // assign inode for the new file
      fileDescriptor.setIndexNodeNumber( indexNodeNumber ) ;
    }

    return open( fileDescriptor ) ;
  }

  /**
   * Terminate the current "process".  Any open files will be closed.
   *

   * Simulates the unix system call:
   *   exit(int status);
   *
   * Note: If this is the last process to terminate, this method
   * calls finalize().
   * @param status the exit status
   * @exception java.lang.Exception if any underlying
   * Exception is thrown
   */
  public static void exit( int status )
    throws Exception
  {
    // close anything that might be open for the current process
    for( int i = 0 ; i < process.openFiles.length ; i ++ )
      if( process.openFiles[i] != null )
      {
        close( i ) ;
      }

    // terminate the process
    process = null ;
    processCount -- ;

    // if this is the last process to end, call finalize
    if( processCount <= 0 )
      finalize( status ) ;
  }

  /**
   * Set the current file pointer for a file.
   * The current file position is updated based on the values of
   * offset and whence.  If whence is 0, the new position is
   * offset bytes from the beginning of the file.  If whence is
   * 1, the new position is the current position plus the value
   * of offset.  If whence is 2, the new position is the size
   * of the file plus the offset value.  Note that offset may be
   * negative if whence is 1 or 2, as long as the resulting
   * position is not less than zero.  It is valid to position
   * past the end of the file, but it is not valid to read
   * past the end of the file.
   *

   * Simulates the unix system call:
   *   lseek( int filedes , int offset , int whence );
   *
   * @param fd the file descriptor
   * @param offset the offset
   * @param whence 0 = from beginning of file; 1 = from
   * current position ; 2 = from end of file
   */
  public static int lseek( int fd , int offset , int whence )
  {
    // check fd
    int status = check_fd( fd ) ;
    if( status < 0 )
      return status ;

    FileDescriptor file = process.openFiles[fd] ;

    int newOffset ;
    if( whence == 0 )
      newOffset = offset ;
    else if( whence == 1 )
      newOffset = file.getOffset() + offset ;
    else if ( whence == 2 )
      newOffset = file.getSize() + offset ;
    else
    {
      // bad whence value
      process.errno = EINVAL ;
      return -1 ;
    }

    if( newOffset < 0 )
    {
      // bad offset value
      process.errno = EINVAL ;
      return -1 ;
    }

    file.setOffset( newOffset ) ;
    return newOffset ;
  }

  /*
   * Open flags.
   */

  /**
   * Open with read-only access.
   */
  public static final int O_RDONLY = 0 ;

  /**
   * Open with write-only access.
   */
  public static final int O_WRONLY = 1 ;

  /**
   * Open for read or write access.
   */
  public static final int O_RDWR = 2 ;

  /**
   * Opens a file or directory for reading, writing, or
   * both reading and writing.
   *

   * The file is positioned at the beginning (byte 0).
   * The returned file descriptor must be used for subsequent
   * calls for other input and output functions on the file.
   *
   * Simulates the unix system call:
   *   int open(const char *pathname, int flags );
   *
   * @param pathname the name of the file or directory to open
   * @param flags the flags to use when opening the file: O_RDONLY,
   * O_WRONLY, or O_RDWR.
   * @return the file descriptor (a non-negative integer); -1 if
   * the file does not exist, if one of the necessary directories
   * does not exist or is unreadable, if the file is not readable
   * (resp. writable), or if too many files are open.
   * @exception java.lang.Exception if any underlying action causes
   * an exception to be thrown
   */
  public static int open( String pathname , int flags )
    throws Exception
  {
    // get the full path name
    String fullPath = getFullPath( pathname ) ;

    IndexNode indexNode = new IndexNode() ;
    short indexNodeNumber = findIndexNode( fullPath , indexNode ) ;
    if( indexNodeNumber < 0 )
      return -1 ;

    // ??? return (Exxx) if the file is not readable
    // and was opened O_RDONLY or O_RDWR

    // ??? return (Exxx) if the file is not writable
    // and was opened O_WRONLY or O_RDWR

    // set up the file descriptor
    FileDescriptor fileDescriptor = new FileDescriptor(
      openFileSystems[ ROOT_FILE_SYSTEM ] , indexNode , flags ) ;
    fileDescriptor.setIndexNodeNumber( indexNodeNumber ) ;

    return open( fileDescriptor ) ;
  }

  /**
   * Open a file using a FileDescriptor.  The open and create
   * methods build a file descriptor and then invoke this method
   * to complete the open process.
   *

   * This is a convenience method for the simulator kernel.
   * @param fileDescriptor the file descriptor
   * @return the file descriptor index in the process open file
   * list assigned to this open file
   */
  private static int open( FileDescriptor fileDescriptor )
  {
    // scan the kernel open file list for a slot
    // and add our new file descriptor
    int kfd = -1 ;
    for( int i = 0 ; i < MAX_OPEN_FILES ; i ++ )
      if( openFiles[i] == null )
      {
        kfd = i ;
        openFiles[kfd] = fileDescriptor ;
        break ;
      }
    if( kfd == -1 )
    {
      // return (ENFILE) if there are already too many open files
      process.errno = ENFILE ;
      return -1 ;
    }

    // scan the list of open files for a slot
    // and add our new file descriptor
    int fd = -1 ;
    for( int i = 0 ; i < ProcessContext.MAX_OPEN_FILES ; i ++ )
      if( process.openFiles[i] == null )
      {
        fd = i ;
        process.openFiles[fd] = fileDescriptor ;
        break ;
      }
    if( fd == -1 )
    {
      // remove the file from the kernel list
      openFiles[kfd] = null ;
      // return (EMFILE) if there isn't room left
      process.errno = EMFILE ;
      return -1 ;
    }

    // return the index of the file descriptor for now open file
    return fd ;
  }

  /**
   * Read bytes from a file.
   *

   * Simulates the unix system call:
   *   int read(int fd, void *buf, size_t count);
   *
   * @param fd the file descriptor of a file open for reading
   * @param buf an array of bytes into which bytes are read
   * @param count the number of bytes to read from the file
   * @return the number of bytes actually read; or -1 if an error occurs.
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int read( int fd , byte[] buf , int count )
    throws Exception
  {
    // check fd
    int status = check_fd_for_read( fd ) ;
    if( status < 0 )
      return status ;

    FileDescriptor file = process.openFiles[fd] ;
    int offset = file.getOffset() ;
    int size = file.getSize() ;
    int blockSize = file.getBlockSize() ;
    byte[] bytes = file.getBytes() ;
    int readCount = 0 ;
    for( int i = 0 ; i < count ; i ++ )
    {
      // if we read to the end of the file, stop reading
      if( offset >= size )
        break ;
      // if this is the first time through the loop,
      // or if we're at the beginning of a block, load the data block
      if( ( i == 0 ) || ( ( offset % blockSize ) == 0 ) )
      {
        status = file.readBlock( (short)( offset / blockSize ) ) ;
        if( status < 0 )
          return status ;
      }
      // copy a byte from the file buffer to the read buffer
      buf[i] = bytes[ offset % blockSize ] ;
      offset ++ ;
      readCount ++ ;
    }
    // update the offset
    file.setOffset( offset ) ;

    // return the count of bytes read
    return readCount ;
  }

  /**
   * Reads a directory entry from a file descriptor for an open directory.
   *

   * Simulates the unix system call:
   *   int readdir(unsigned int fd, struct dirent *dirp ) ;
   *
   * Note that count is ignored in the unix call.
   * @param fd the file descriptor for the directory being read
   * @param dirp the directory entry into which data should be copied
   * @return number of bytes read; 0 if end of directory; -1 if the file
   * descriptor is invalid, or if the file is not opened for read access.
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int readdir( int fd , DirectoryEntry dirp )
    throws Exception
  {
    // check fd
    int status = check_fd_for_read( fd ) ;
    if( status < 0 )
      return status ;

    // NOTE: the statements from here down to "return 0" were lost in
    // this copy of the file and are reconstructed to mirror writedir()
    // and the @return description above.
    FileDescriptor file = process.openFiles[fd] ;

    // check to see if the file is a directory
    if( ( file.getMode() & S_IFMT ) != S_IFDIR )
    {
      // return (ENOTDIR) if a needed directory is not a directory
      process.errno = ENOTDIR ;
      return -1 ;
    }

    // return 0 if at end of directory
    if( file.getOffset() >= file.getSize() )
      return 0 ;

    // read a block, if needed
    status = file.readBlock(
      (short)( file.getOffset() / file.getBlockSize() ) ) ;
    if( status < 0 )
      return status ;

    // read bytes from the block into the DirectoryEntry
    dirp.read( file.getBytes() ,
      file.getOffset() % file.getBlockSize() ) ;
    file.setOffset( file.getOffset() +
      DirectoryEntry.DIRECTORY_ENTRY_SIZE ) ;

    // return the size of a DirectoryEntry
    return DirectoryEntry.DIRECTORY_ENTRY_SIZE ;
  }

  /**
   * Obtain information for an open file.
   *

   * Simulates the unix system call:
   *   int fstat(int filedes, struct stat *buf);
   *
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int fstat( int fd , Stat buf )
    throws Exception
  {
    // check fd
    int status = check_fd( fd ) ;
    if( status < 0 )
      return status ;

    FileDescriptor fileDescriptor = process.openFiles[fd] ;
    short deviceNumber = fileDescriptor.getDeviceNumber() ;
    short indexNodeNumber = fileDescriptor.getIndexNodeNumber() ;
    IndexNode indexNode = fileDescriptor.getIndexNode() ;

    // copy information to buf
    buf.st_dev = deviceNumber ;
    buf.st_ino = indexNodeNumber ;
    buf.copyIndexNode( indexNode ) ;

    return 0 ;
  }

  /**
   * Obtain information about a named file.
   *

   * Simulates the unix system call:
   *   int stat(const char *name, struct stat *buf);
   *
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int stat( String name , Stat buf )
    throws Exception
  {
    // a buffer for reading directory entries
    DirectoryEntry directoryEntry = new DirectoryEntry() ;

    // get the full path
    String path = getFullPath( name ) ;

    // find the index node
    IndexNode indexNode = new IndexNode() ;
    short indexNodeNumber = findIndexNode( path , indexNode ) ;
    if( indexNodeNumber < 0 )
    {
      // return ENOENT
      process.errno = ENOENT ;
      return -1 ;
    }

    // copy information to buf
    buf.st_dev = ROOT_FILE_SYSTEM ;
    buf.st_ino = indexNodeNumber ;
    buf.copyIndexNode( indexNode ) ;

    return 0 ;
  }

  /**
   * First commits inodes to buffers, and then buffers to disk.
   *

   * Simulates unix system call:
   *   int sync(void);
   */
  public static void sync()
  {
    // write out superblock if updated
    // write out free list blocks if updated
    // write out inode blocks if updated
    // write out data blocks if updated

    // at present, all changes to inodes, data blocks,
    // and free list blocks
    // are written as they go, so this method does nothing.
  }

  /**
   * Write bytes to a file.
   *
   * Simulates the unix system call:
   *   int write(int fd, const void *buf, size_t count);
   *
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int write( int fd , byte[] buf , int count )
    throws Exception
  {
    // check fd
    int status = check_fd_for_write( fd ) ;
    if( status < 0 )
      return status ;

    FileDescriptor file = process.openFiles[fd] ;

    // return (ENOSPC) if the device containing the file system
    // referred to by fd has not room for the data

    int offset = file.getOffset() ;
    int size = file.getSize() ;
    int blockSize = file.getBlockSize() ;
    byte[] bytes = file.getBytes() ;
    int writeCount = 0 ;
    for( int i = 0 ; i < count ; i ++ )
    {
      // if this is the first time through the loop,
      // or if we're at the beginning of a block,
      // load or allocate a data block
      if( ( i == 0 ) || ( ( offset % blockSize ) == 0 ) )
      {
        status = file.readBlock( (short)( offset / blockSize ) ) ;
        if( status < 0 )
          return status ;
      }
      // copy a byte from the write buffer to the file buffer
      bytes[ offset % blockSize ] = buf[i] ;
      offset ++ ;
      // if we get to the end of a block, write it out
      if( ( offset % blockSize ) == 0 )
      {
        status =
          file.writeBlock( (short)( ( offset - 1 ) / blockSize ) ) ;
        if( status < 0 )
          return status ;
        // update the file size if it grew
        if( offset > size )
        {
          file.setSize( offset ) ;
          size = offset ;
        }
      }
      writeCount ++ ;
    }

    // write the last block if we wrote anything to it
    if( ( offset % blockSize ) > 0 )
    {
      status = file.writeBlock( (short)( ( offset - 1 ) / blockSize ) ) ;
      if( status < 0 )
        return status ;
    }

    // update the file size if it grew
    if( offset > size )
      file.setSize( offset ) ;

    // update the offset
    file.setOffset( offset ) ;

    // return the count of bytes written
    return writeCount ;
  }

/**
 * Writes a directory entry from a file descriptor for an
 * open directory.
 * Simulates the unix system call:
 *   int readdir(unsigned int fd, struct dirent *dirp ) ;
 */
public static int writedir( int fd , DirectoryEntry dirp )
  throws Exception
{
  // check fd
  int status = check_fd_for_write( fd ) ;
  if( status < 0 )
    return status ;

  FileDescriptor file = process.openFiles[fd] ;

  // check to see if the file is a directory
  if( ( file.getMode() & S_IFMT ) != S_IFDIR )
  {
    // return (ENOTDIR) if a needed directory is not a directory
    process.errno = ENOTDIR ;
    return -1 ;
  }

  short blockSize = file.getBlockSize() ;

  // allocate or read a block
  status = file.readBlock( (short)( file.getOffset() / blockSize ) ) ;
  if( status < 0 )
    return status ;

  // write bytes from the DirectoryEntry into the block
  dirp.write( file.getBytes() , file.getOffset() % blockSize ) ;

  // write the updated block
  status = file.writeBlock( (short)( file.getOffset() / blockSize ) ) ;
  if( status < 0 )
    return status ;

  // update the file size
  file.setOffset( file.getOffset() + DirectoryEntry.DIRECTORY_ENTRY_SIZE ) ;
  if( file.getOffset() > file.getSize() )
    file.setSize( file.getOffset() ) ;

  // return the size of a DirectoryEntry
  return DirectoryEntry.DIRECTORY_ENTRY_SIZE ;
}

/*
to be done:
  int access(const char *pathname, int mode);
  int link(const char *oldpath, const char *newpath);
  int unlink(const char *pathname);
  int rename(const char *oldpath, const char *newpath);
  int symlink(const char *oldpath, const char *newpath);
  int lstat(const char *file_name, struct stat *buf);
  int chmod(const char *path, mode_t mode);
  int fchmod(int fildes, mode_t mode);
  int chown(const char *path, uid_t owner, gid_t group);
  int fchown(int fd, uid_t owner, gid_t group);
  int lchown(const char *path, uid_t owner, gid_t group);
  int utime(const char *filename, struct utimbuf *buf);
  int readlink(const char *path, char *buf, size_t bufsiz);
  int chdir(const char *path);
  mode_t umask(mode_t mask);
*/

/**
 * This is an internal variable for the simulator which always
 * points to the
 * current ProcessContext.  If multiple processes are implemented,
 * then this variable will "point" to different processes at
 * different times.
 */
private static ProcessContext process = null ;

/**
 * The number of processes.
 */
private static int processCount = 0 ;

private static int MAX_OPEN_FILES = 0 ;

private static FileDescriptor[] openFiles = null ;

// ??? should be private?
public static int MAX_OPEN_FILE_SYSTEMS = 1 ;

// ??? should be private?
public static FileSystem[] openFileSystems =
  new FileSystem[MAX_OPEN_FILE_SYSTEMS] ;

// ??? should be private?
public static short ROOT_FILE_SYSTEM = 0 ;

/**
 * Initialize the file simulator kernel.  This should be the
 * first call in any simulation program.  You can think of this
 * as the method which "boots" the kernel.
 * This method opens the "filesys.conf" file (or the file named
 * by the system property "filesys.conf") and reads any properties
 * given in that file, including the filesystem.root.filename and
 * filesystem.root.mode ("r", "rw").
 */
public static void initialize()
{
  // check to see if the name of an alternate configuration
  // file has been specified.  This can be done, for example,
  //   java -Dfilesys.conf=myfile.txt program-name parameter ...
  String propertyFileName = System.getProperty( "filesys.conf" ) ;
  if ( propertyFileName == null )
    propertyFileName = "filesys.conf" ;
  Properties properties = new Properties() ;
  try
  {
    FileInputStream in = new FileInputStream( propertyFileName ) ;
    properties.load( in ) ;
    in.close() ;
  }
  catch( FileNotFoundException e )
  {
    System.err.println( PROGRAM_NAME + ": error opening properties file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }
  catch( IOException e )
  {
    System.err.println( PROGRAM_NAME + ": error reading properties file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  // get the root file system properties
  String rootFileSystemFilename =
    properties.getProperty( "filesystem.root.filename" , "filesys.dat" ) ;
  String rootFileSystemMode =
    properties.getProperty( "filesystem.root.mode" , "rw" ) ;

  // get the current process properties
  short uid = 1 ;
  try
  {
    uid = Short.parseShort( properties.getProperty( "process.uid" , "1" ) ) ;
  }
  catch ( NumberFormatException e )
  {
    System.err.println( PROGRAM_NAME +
      ": invalid number for property process.uid in configuration file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  short gid = 1 ;
  try
  {
    gid = Short.parseShort( properties.getProperty( "process.gid" , "1" ) ) ;
  }
  catch ( NumberFormatException e )
  {
    System.err.println( PROGRAM_NAME +
      ": invalid number for property process.gid in configuration file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  // the umask is written in octal, so it is parsed with radix 8
  short umask = 0002 ;
  try
  {
    umask = Short.parseShort(
      properties.getProperty( "process.umask" , "002" ) , 8 ) ;
  }
  catch ( NumberFormatException e )
  {
    System.err.println( PROGRAM_NAME +
      ": invalid number for property process.umask in configuration file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  String dir = "/root" ;
  dir = properties.getProperty( "process.dir" , "/root" ) ;

  try
  {
    MAX_OPEN_FILES = Integer.parseInt( properties.getProperty(
      "kernel.max_open_files" , "20" ) ) ;
  }
  catch( NumberFormatException e )
  {
    System.err.println( PROGRAM_NAME +
      ": invalid number for property kernel.max_open_files in configuration file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  try
  {
    ProcessContext.MAX_OPEN_FILES = Integer.parseInt(
      properties.getProperty( "process.max_open_files" , "10" ) ) ;
  }
  catch( NumberFormatException e )
  {
    System.err.println( PROGRAM_NAME +
      ": invalid number for property process.max_open_files in configuration file" ) ;
    System.exit( EXIT_FAILURE ) ;
  }

  // create open file array
  openFiles = new FileDescriptor[MAX_OPEN_FILES] ;

  // create the first process
  process = new ProcessContext( uid , gid , dir , umask ) ;
  processCount ++ ;

  // open the root file system
  try
  {
    openFileSystems[ROOT_FILE_SYSTEM] = new FileSystem(
      rootFileSystemFilename , rootFileSystemMode ) ;
  }
  catch( IOException e )
  {
    System.err.println( PROGRAM_NAME + ": unable to open root file system" ) ;
    System.exit( EXIT_FAILURE ) ;
  }
}

/**
 * Failure exit status.
 */
private static int EXIT_FAILURE = 1 ;

/**
 * Success exit status.
 */
private static int EXIT_SUCCESS = 0 ;

/**
 * End the simulation and exit.
 * Terminates any remaining "processes", flushes all file system blocks
 * to "disk", and exits the simulation program.  This method is generally
 * called by exit() when the last process terminates.  However,
 * it may also be called directly to gracefully end the simulation.
 * @param status the status to use with System.exit()
 * @exception java.lang.Exception if any underlying operation
 * causes an exception to be thrown.
 */
public static void finalize( int status )
  throws Exception
{
  // exit() any remaining processes
  if( process != null )
    exit( 0 ) ;

  // flush file system blocks
  sync() ;

  // close the root file system
  openFileSystems[ROOT_FILE_SYSTEM].close() ;

  // terminate the program
  System.exit( status ) ;
}

/*Some internal methods.*/

/**
 * Check to see if the integer given is a valid file descriptor
 * index for the current process.  Sets errno to EBADF if invalid.
 * This is a convenience method for the simulator kernel;
 * it should not be called by user programs.
 * @param fd the file descriptor index
 * @return zero if the file descriptor index is valid; -1 if the file
 * descriptor index is not valid
 */
private static int check_fd( int fd )
{
  // look for the file descriptor in the open file list
  if ( fd < 0 ||
       fd >= process.openFiles.length ||
       process.openFiles[fd] == null )
  {
    // return (EBADF) if file descriptor is invalid
    process.errno = EBADF ;
    return -1 ;
  }

  return 0 ;
}

/**
 * Check to see if the integer given is a valid file descriptor
 * index for the current process, and if so, whether the file is
 * open for reading.  Sets errno to EBADF if invalid or not open
 * for reading.
 */
private static int check_fd_for_read( int fd )
{
  int status = check_fd( fd ) ;
  if( status < 0 )
    return -1 ;

  FileDescriptor fileDescriptor = process.openFiles[fd] ;
  int flags = fileDescriptor.getFlags() ;
  if( ( flags != O_RDONLY ) &&
      ( flags != O_RDWR ) )
  {
    // return (EBADF) if the file is not open for reading
    process.errno = EBADF ;
    return -1 ;
  }

  return 0 ;
}

/**
 * Check to see if the integer given is a valid file descriptor
 * index for the current process, and if so, whether the file is
 * open for writing.  Sets errno to EBADF if invalid or not open
 * for writing.
 */
private static int check_fd_for_write( int fd )
{
  int status = check_fd( fd ) ;
  if( status < 0 )
    return -1 ;

  FileDescriptor fileDescriptor = process.openFiles[fd] ;
  int flags = fileDescriptor.getFlags() ;
  if( ( flags != O_WRONLY ) &&
      ( flags != O_RDWR ) )
  {
    // return (EBADF) if the file is not open for writing
    process.errno = EBADF ;
    return -1 ;
  }

  return 0 ;
}

/**
 * Get the full path for a file by adding
 * the working directory for the current process
 * to the beginning of the given path name
 * if necessary.
 * @param pathname the given path name
 * @return the resulting fully qualified path name
 */
private static String getFullPath( String pathname )
{
  String fullPath = null ;

  // make sure the path starts with a slash
  if( pathname.startsWith( "/" ) )
    fullPath = pathname ;
  else
    fullPath = process.getDir() + "/" + pathname ;
  return fullPath ;
}

private static IndexNode rootIndexNode = null ;

private static IndexNode getRootIndexNode()
{
  if( rootIndexNode == null )
    rootIndexNode = openFileSystems[ROOT_FILE_SYSTEM].getRootIndexNode() ;
  return rootIndexNode ;
}

private static short findNextIndexNode(
  FileSystem fileSystem , IndexNode indexNode , String name ,
  IndexNode nextIndexNode ) throws Exception
{
  // if stat isn't a directory give an error
  if( ( indexNode.getMode() & S_IFMT ) != S_IFDIR )
  {
    // return (ENOTDIR) if a needed directory is not a directory
    process.errno = ENOTDIR ;
    return -1 ;
  }

  // if user isn't allowed to read directory, give an error
  // ??? tbd
  // return (EACCES) if a needed directory is not readable

  FileDescriptor fileDescriptor =
    new FileDescriptor( fileSystem , indexNode , O_RDONLY ) ;
  int fd = open( fileDescriptor ) ;
  if( fd < 0 )
  {
    // process.errno = ???
    return -1 ;
  }

  // create a buffer for reading directory entries
  DirectoryEntry directoryEntry = new DirectoryEntry() ;

  int status = 0 ;
  short indexNodeNumber = -1 ;
  // while there are more directory blocks to be read
  while( true )
  {
    // read a directory entry
    status = readdir( fd , directoryEntry ) ;
    if( status <= 0 )
    {
      // we got to the end of the directory, or
      // encountered an error, so quit
      break ;
    }
    if( directoryEntry.getName().equals( name ) )
    {
      indexNodeNumber = directoryEntry.getIno() ;
      // read the inode block
      fileSystem.readIndexNode( nextIndexNode , indexNodeNumber ) ;
      // we're done searching
      break ;
    }
  }

  // close the file since we're done with it
  int close_status = close( fd ) ;
  if( close_status < 0 )
  {
    // process.errno = ???
    return -1 ;
  }

  // if we encountered an error reading, return error
  if( status < 0 )
  {
    // process.errno = ???
    return -1 ;
  }

  // if we got to the end of the directory without finding the name,
  // return error
  if( status == 0 )
  {
    process.errno = ENOENT ;
    return -1 ;
  }

  // return index node number if success
  return indexNodeNumber ;
}

// get the inode for a file which is expected to exist
private static short findIndexNode( String path , IndexNode inode )
  throws Exception
{
  // start with the root file system, root inode
  FileSystem fileSystem = openFileSystems[ ROOT_FILE_SYSTEM ] ;
  IndexNode indexNode = getRootIndexNode( ) ;
  short indexNodeNumber = FileSystem.ROOT_INDEX_NODE_NUMBER ;

  // parse the path until we get to the end
  StringTokenizer st = new StringTokenizer( path , "/" ) ;
  while( st.hasMoreTokens() )
  {
    String s = st.nextToken() ;
    if ( ! s.equals("") )
    {
      // check to see if it is a directory
      if( ( indexNode.getMode() & S_IFMT ) != S_IFDIR )
      {
        // return (ENOTDIR) if a needed directory is not a directory
        process.errno = ENOTDIR ;
        return -1 ;
      }

      // check to see if it is readable by the user
      // ??? tbd
      // return (EACCES) if a needed directory is not readable

      IndexNode nextIndexNode = new IndexNode() ;
      // get the next index node corresponding to the token
      indexNodeNumber = findNextIndexNode(
        fileSystem , indexNode , s , nextIndexNode ) ;
      if( indexNodeNumber < 0 )
      {
        // return ENOENT
        process.errno = ENOENT ;
        return -1 ;
      }
      indexNode = nextIndexNode ;
    }
  }

  // copy indexNode to inode
  indexNode.copy( inode ) ;
  return indexNodeNumber ;
}

}

filesys/ls.class
public synchronized class ls {
public static String PROGRAM_NAME;
public void ls();
public static void main(String[]) throws Exception;
private static void print(String, Stat);
static void ();
}
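
The block-at-a-time loop in Kernel.write() above can be tried in isolation. The following is only a minimal sketch under stated assumptions: BlockWriteSketch, its in-memory list of blocks, and the helper names are hypothetical stand-ins for the simulator's FileDescriptor and are not part of the project. It mirrors the same steps: load or allocate a block on the first byte and at each block boundary, copy one byte, flush at the end of a block, and flush the final partial block.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a block-backed file (not the simulator's class). */
public class BlockWriteSketch {
    private final int blockSize;
    private final List<byte[]> blocks = new ArrayList<>(); // the pretend "disk"
    private byte[] buffer;                                  // current block buffer
    private int offset = 0;                                 // current file offset
    private int size = 0;                                   // current file size

    public BlockWriteSketch(int blockSize) {
        this.blockSize = blockSize;
        this.buffer = new byte[blockSize];
    }

    // Load block n into the buffer, allocating empty blocks as needed.
    private void readBlock(int n) {
        while (blocks.size() <= n) {
            blocks.add(new byte[blockSize]);
        }
        buffer = blocks.get(n).clone();
    }

    // Store the buffer back as block n.
    private void writeBlock(int n) {
        blocks.set(n, buffer.clone());
    }

    /** Write count bytes from buf at the current offset; returns bytes written. */
    public int write(byte[] buf, int count) {
        int written = 0;
        for (int i = 0; i < count; i++) {
            // first byte, or start of a new block: load or allocate it
            if (i == 0 || offset % blockSize == 0) {
                readBlock(offset / blockSize);
            }
            buffer[offset % blockSize] = buf[i];
            offset++;
            // reached the end of a block: flush it
            if (offset % blockSize == 0) {
                writeBlock((offset - 1) / blockSize);
            }
            written++;
        }
        // flush the last, partially filled block
        if (offset % blockSize > 0) {
            writeBlock((offset - 1) / blockSize);
        }
        if (offset > size) {
            size = offset;
        }
        return written;
    }

    public int getSize() { return size; }

    public int blockCount() { return blocks.size(); }

    public byte byteAt(int pos) { return blocks.get(pos / blockSize)[pos % blockSize]; }

    public static void main(String[] args) {
        BlockWriteSketch file = new BlockWriteSketch(4);
        byte[] data = "abcdefghij".getBytes(StandardCharsets.US_ASCII);
        int n = file.write(data, data.length);
        System.out.println(n + " bytes written into " + file.blockCount() + " blocks");
    }
}
```

With a block size of 4, ten bytes span three blocks, and the third is only half full, which is why the extra flush after the loop is needed.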

filesys/ls.java

/**
 * A simple directory listing program for a simulated file system.
 * Usage:
 *   java ls path-name ...
 */
public class ls
{
  /**
   * The name of this program.
   * This is the program name that is used
   * when displaying error messages.
   */
  public static String PROGRAM_NAME = "ls" ;

  /**
   * Lists information about named files or directories.
   * @exception java.lang.Exception if an exception is thrown
   * by an underlying operation
   */
  public static void main( String[] args ) throws Exception
  {
    // initialize the file system simulator kernel
    Kernel.initialize() ;

    // for each path-name given
    for( int i = 0 ; i < args.length ; i ++ )
    {
      String name = args[i] ;
      int status = 0 ;

      // stat the name to get information about the file or directory
      Stat stat = new Stat() ;
      status = Kernel.stat( name , stat ) ;
      if( status < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        Kernel.exit( 1 ) ;
      }

      // mask the file type from the mode
      short type = (short)( stat.getMode() & Kernel.S_IFMT ) ;

      // if name is a regular file, print the info
      if( type == Kernel.S_IFREG )
      {
        print( name , stat ) ;
      }

      // if name is a directory open it and read the contents
      else if( type == Kernel.S_IFDIR )
      {
        // open the directory
        int fd = Kernel.open( name , Kernel.O_RDONLY ) ;
        if( fd < 0 )
        {
          Kernel.perror( PROGRAM_NAME ) ;
          System.err.println( PROGRAM_NAME +
            ": unable to open \"" + name + "\" for reading" ) ;
          Kernel.exit( 1 ) ;
        }

        // print a heading for this directory
        System.out.println() ;
        System.out.println( name + ":" ) ;

        // create a directory entry structure to hold data as we read
        DirectoryEntry directoryEntry = new DirectoryEntry() ;
        int count = 0 ;

        // while we can read, print the information on each entry
        while( true )
        {
          // read an entry; quit loop if error or nothing read
          status = Kernel.readdir( fd , directoryEntry ) ;
          if( status <= 0 )
            break ;

          // get the name from the entry
          String entryName = directoryEntry.getName() ;

          // call stat() to get info about the file
          status = Kernel.stat( name + "/" + entryName , stat ) ;
          if( status < 0 )
          {
            Kernel.perror( PROGRAM_NAME ) ;
            Kernel.exit( 1 ) ;
          }

          // print the entry information
          print( entryName , stat ) ;
          count ++ ;
        }

        // check to see if our last read failed
        if( status < 0 )
        {
          Kernel.perror( "main" ) ;
          System.err.println( "main: unable to read directory entry from /" ) ;
          Kernel.exit( 2 ) ;
        }

        // close the directory
        Kernel.close( fd ) ;

        // print a footing for this directory
        System.out.println( "total files: " + count ) ;
      }
    }

    // exit with success if we process all the arguments
    Kernel.exit( 0 ) ;
  }

  /**
   * Print a listing for a particular file.
   * This is a convenience method.
   * @param name the name to print
   * @param stat the stat containing the file's information
   */
  private static void print( String name , Stat stat )
  {
    // a buffer to fill with a line of output
    StringBuffer s = new StringBuffer() ;

    // a temporary string
    String t = null ;

    // append the inode number in a field of 5
    t = Integer.toString( stat.getIno() ) ;
    for( int i = 0 ; i < 5 - t.length() ; i ++ )
      s.append( ' ' ) ;
    s.append( t ) ;
    s.append( ' ' ) ;

    // append the size in a field of 10
    t = Integer.toString( stat.getSize() ) ;
    for( int i = 0 ; i < 10 - t.length() ; i ++ )
      s.append( ' ' ) ;
    s.append( t ) ;
    s.append( ' ' ) ;

    // append the name
    s.append( name ) ;

    // print the buffer
    System.out.println( s.toString() ) ;
  }

}

filesys/mkdir.class
public synchronized class mkdir {
public static final String PROGRAM_NAME = mkdir;
public void mkdir();
public static void main(String[]) throws Exception;
}

filesys/mkdir.java

/**
 * An mkdir for a simulated file system.

 * Usage:
 *   java mkdir directory-name ...
 */
public class mkdir
{
  /**
   * The name of this program.
   * This is the program name that is used
   * when displaying error messages.
   */
  public static final String PROGRAM_NAME = "mkdir" ;

  /**
   * Creates the directories given as command line arguments.
   * @exception java.lang.Exception if an exception is thrown
   * by an underlying operation
   */
  public static void main( String[] args ) throws Exception
  {
    // initialize the file system simulator kernel
    Kernel.initialize() ;

    // print a helpful message if no command line arguments are given
    if( args.length < 1 )
    {
      System.err.println( PROGRAM_NAME + ": too few arguments" ) ;
      Kernel.exit( 1 ) ;
    }

    // create a buffer for writing directory entries
    byte[] directoryEntryBuffer =
      new byte[ DirectoryEntry.DIRECTORY_ENTRY_SIZE ] ;

    // for each argument given on the command line
    for( int i = 0 ; i < args.length ; i ++ )
    {
      // give the argument a better name
      String name = args[i] ;
      int status = 0 ;

      // call creat() to create the file
      int newDir = Kernel.creat( name , Kernel.S_IFDIR ) ;
      if( newDir < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        System.err.println( PROGRAM_NAME + ": \"" + name + "\"" ) ;
        Kernel.exit( 2 ) ;
      }

      // get file info for "."
      Stat selfStat = new Stat() ;
      status = Kernel.fstat( newDir , selfStat ) ;
      if( status < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        Kernel.exit( 3 ) ;
      }

      // add entry for "."
      DirectoryEntry self = new DirectoryEntry( selfStat.getIno() , "." ) ;
      self.write( directoryEntryBuffer , 0 ) ;
      status = Kernel.write(
        newDir , directoryEntryBuffer , directoryEntryBuffer.length ) ;
      if( status < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        Kernel.exit( 4 ) ;
      }

      // get file info for ".."
      Stat parentStat = new Stat() ;
      Kernel.stat( name + "/.." , parentStat ) ;

      // add entry for ".."
      DirectoryEntry parent = new DirectoryEntry( parentStat.getIno() , ".." ) ;
      parent.write( directoryEntryBuffer , 0 ) ;
      status = Kernel.write(
        newDir , directoryEntryBuffer , directoryEntryBuffer.length ) ;
      if( status < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        Kernel.exit( 5 ) ;
      }

      // call close() to close the file
      status = Kernel.close( newDir ) ;
      if( status < 0 )
      {
        Kernel.perror( PROGRAM_NAME ) ;
        Kernel.exit( 6 ) ;
      }
    }

    // exit with success if we process all the arguments
    Kernel.exit( 0 ) ;
  }

}

filesys/mkfs.class
public synchronized class mkfs {
public void mkfs();
public static void main(String[]) throws Exception;
}

filesys/mkfs.java

import java.io.* ;

/**
 * A MOSS File System Simulator program which uses the simulator
 * classes to create a "file system".
 */
public class mkfs
{
  /**
   * Creates a "file system" in the named file with the specified
   * blocksize and number of blocks.
   * @exception java.lang.Exception if any exception occurs
   */
  public static void main( String[] argv ) throws Exception
  {
    if( argv.length != 3 )
    {
      System.err.println( "mkfs: usage: java mkfs " ) ;
      System.exit( 1 ) ;
    }

    String filename = argv[0] ;
    short block_size = Short.parseShort( argv[1] ) ;
    int blocks = Integer.parseInt( argv[2] ) ;

    int block_total = 0 ;

    /*
      blocks =
        super_blocks +
        free_list_blocks +
        inode_blocks +
        data_blocks

      We need one block for the superblock.

      super_blocks = 1

      We need one bit in the free list map for each data block.

      free_list_blocks =
        ( data_blocks + block_size * 8 - 1 ) /
        ( block_size * 8 )

      ??? Is this the correct number of inodes?
      At worst, there will be only directory entries and empty files.
      In other words, we might need as many inodes as we have blocks.

      inode_blocks =
        ( data_blocks + block_size / inode_size - 1 ) /
        ( block_size / inode_size )

      Then:

      blocks =
        super_blocks +
        ( data_blocks + block_size * 8 - 1 ) /
        ( block_size * 8 ) +
        ( data_blocks + block_size / inode_size - 1 ) /
        ( block_size / inode_size ) +
        data_blocks

      We then seek the maximum number of data blocks where the total number
      of blocks is less than or equal to the number of blocks available.
      We use a binary searching technique in the following algorithm.
    */

    int inode_size = IndexNode.INDEX_NODE_SIZE ;

    int super_blocks = 1 ;
    int free_list_blocks = 0 ;
    int inode_blocks = 0 ;
    int data_blocks = 0 ;

    int lo = 0 ;
    int hi = blocks ;

    while( lo <= hi )
    {
      data_blocks = ( lo + hi + 1 ) / 2 ;
      free_list_blocks = ( data_blocks + block_size * 8 - 1 ) /
        ( block_size * 8 ) ;
      inode_blocks = ( data_blocks + block_size / inode_size - 1 ) /
        ( block_size / inode_size ) ;
      block_total = super_blocks + free_list_blocks +
        inode_blocks + data_blocks ;
      /*
      Just in case you want to see it converge...
      System.out.println( "lo: " + lo + " hi: " + hi ) ;
      System.out.println( "block_size: " + block_size ) ;
      System.out.println( "blocks: " + blocks ) ;
      System.out.println( "free_list_blocks: " + free_list_blocks ) ;
      System.out.println( "inode_blocks: " + inode_blocks ) ;
      System.out.println( "data_blocks: " + data_blocks ) ;
      System.out.println( "block_total: " + block_total ) ;
      System.out.println() ;
      */
      if ( block_total < blocks )
        lo = data_blocks + 1 ;
      else if ( block_total > blocks )
        hi = data_blocks - 1 ;
      else
        break ;
    }

    // if the last block causes free_list_blocks or inode_blocks to
    // cross a block boundary, we "give" the extra space to the free
    // list and/or inodes and use whatever remains for the data blocks
    if( block_total > blocks )
    {
      // System.out.println( "adjusting data blocks..." ) ;
      // System.out.println() ;
      data_blocks -- ;
    }

    // calculate inode and free list blocks based on the final
    // count of data blocks
    free_list_blocks = ( data_blocks + block_size * 8 - 1 ) /
      ( block_size * 8 ) ;
    inode_blocks = blocks - super_blocks - free_list_blocks - data_blocks ;
    block_total = super_blocks + free_list_blocks +
      inode_blocks + data_blocks ;

    if ( data_blocks <= 0 )
    {
      System.err.println(
        "mkfs: parameters resulted in data block count less than one" ) ;
      System.exit( 2 ) ;
    }

    System.out.println( "block_size: " + block_size ) ;
    System.out.println( "blocks: " + blocks ) ;
    System.out.println( "super_blocks: " + super_blocks ) ;
    System.out.println( "free_list_blocks: " + free_list_blocks ) ;
    System.out.println( "inode_blocks: " + inode_blocks ) ;
    System.out.println( "data_blocks: " + data_blocks ) ;
    System.out.println( "block_total: " + block_total ) ;

    // If the file already exists, we delete it.
    // Under jdk 1.2 we can use setLength() to truncate, but
    // for now we must delete and re-create.
    File deleteFile = new File( filename ) ;
    deleteFile.delete() ;

    RandomAccessFile file = new RandomAccessFile( filename , "rw" ) ;

    int superBlockOffset = 0 ;
    int freeListBlockOffset = superBlockOffset + 1 ;
    int inodeBlockOffset = freeListBlockOffset + free_list_blocks ;
    int dataBlockOffset = inodeBlockOffset + inode_blocks ;
    /*
    System.out.println( "free list block offset: " + freeListBlockOffset ) ;
    System.out.println( "inode block offset: " + inodeBlockOffset ) ;
    System.out.println( "data block offset: " + dataBlockOffset ) ;
    */

    // create the superblock
    SuperBlock superBlock = new SuperBlock() ;
    superBlock.setBlockSize( block_size ) ;
    superBlock.setBlocks( blocks ) ;
    superBlock.setFreeListBlockOffset( freeListBlockOffset ) ;
    superBlock.setInodeBlockOffset( inodeBlockOffset ) ;
    superBlock.setDataBlockOffset( dataBlockOffset ) ;

    // write the superblock
    file.seek( superBlockOffset * block_size ) ;
    superBlock.write( file ) ;

    // create the free list bitmap block
    BitBlock freeListBlock = new BitBlock( block_size ) ;

    // all blocks are free except the first block, which contains
    // the directory block for the root directory.
    freeListBlock.setBit( 0 ) ;

    // write the free list bitmap blocks
    file.seek( freeListBlockOffset * block_size ) ;
    freeListBlock.write( file ) ;

    // write the rest of the free list blocks which should be empty
    BitBlock emptyFreeListBlock = new BitBlock( block_size ) ;
    for( int i = freeListBlockOffset + 1 ; i < inodeBlockOffset ; i ++ )
    {
      file.seek( i * block_size ) ;
      emptyFreeListBlock.write( file ) ;
    }

    // create the root inode block
    Block rootInodeBlock = new Block( block_size ) ;

    // create the inode for the root directory
    IndexNode rootIndexNode = new IndexNode() ;

    // set the first block address to the
    // address of the first available data block.
    rootIndexNode.setBlockAddress( 0 , 0 ) ;

    // the root inode is a directory inode
    rootIndexNode.setMode( Kernel.S_IFDIR ) ;

    // there are two directory entries in the root file system,
    // so we set the file size accordingly.
    rootIndexNode.setSize( DirectoryEntry.DIRECTORY_ENTRY_SIZE * 2 ) ;

    // set the link count: itself, dot, dot-dot
    rootIndexNode.setNlink( (short) 3 ) ;

    // write the rootIndexNode to the rootInodeBlock
    rootIndexNode.write( rootInodeBlock.bytes ,
      ( FileSystem.ROOT_INDEX_NODE_NUMBER *
        IndexNode.INDEX_NODE_SIZE ) % block_size ) ;

    // ??? write the rest of the inodes in the first block

    // write the first inode block
    file.seek( inodeBlockOffset * block_size +
      FileSystem.ROOT_INDEX_NODE_NUMBER * IndexNode.INDEX_NODE_SIZE ) ;
    rootInodeBlock.write( file ) ;

    // ??? write the rest of the inode blocks

    // create the root directory block
    Block rootDirectoryBlock = new Block( block_size ) ;

    // the root directory block contains two directory entries:
    // one for itself ("."), and one for its parent ("..").
    // Both of these reference the root inode.
    DirectoryEntry itself = new DirectoryEntry(
      FileSystem.ROOT_INDEX_NODE_NUMBER , "." ) ;
    DirectoryEntry parent = new DirectoryEntry(
      FileSystem.ROOT_INDEX_NODE_NUMBER , ".." ) ;

    // write the root directory entries to the root directory block
    itself.write( rootDirectoryBlock.bytes , 0 ) ;
    parent.write( rootDirectoryBlock.bytes ,
      DirectoryEntry.DIRECTORY_ENTRY_SIZE ) ;

    // write the root directory block to the file
    file.seek( dataBlockOffset * block_size ) ;
    rootDirectoryBlock.write( file ) ;

    // write a zero byte to the last byte of the file system file
    file.seek( blocks * block_size - 1 ) ;
    file.write( 0 ) ;

    file.close() ;
  }

}

filesys/ProcessContext.class
public synchronized class ProcessContext {
public int errno;
private short uid;
private short gid;
private String dir;
private short umask;
public static int MAX_OPEN_FILES;
public FileDescriptor[] openFiles;
public void ProcessContext();
public void ProcessContext(short, short, String, short);
public void setUid(short);
public short getUid();
public void setGid(short);
public short getGid();
public void setDir(String);
public String getDir();
public void setUmask(short);
public short getUmask();
static void ();
}
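
The layout computation in mkfs.java above (one superblock, a free-list bitmap with one bit per data block, one inode per data block, and a binary search for the largest data block count that fits) can be exercised on its own. This is a minimal sketch, not the project's code: the class name MkfsLayoutSketch and the method layout() are hypothetical, and the inode size of 64 bytes used in the example is an assumption, since the value of IndexNode.INDEX_NODE_SIZE is not shown in this section.

```java
/** Hypothetical stand-alone version of the block layout search used by mkfs. */
public class MkfsLayoutSketch {

    /**
     * Returns {super_blocks, free_list_blocks, inode_blocks, data_blocks}
     * for a file system of the given total size.
     */
    public static int[] layout(int blocks, int blockSize, int inodeSize) {
        int superBlocks = 1;
        int bitsPerBlock = blockSize * 8;           // one free-list bit per data block
        int inodesPerBlock = blockSize / inodeSize; // one inode per data block (worst case)

        int freeListBlocks = 0;
        int inodeBlocks = 0;
        int dataBlocks = 0;
        int blockTotal = 0;

        int lo = 0;
        int hi = blocks;
        while (lo <= hi) {
            dataBlocks = (lo + hi + 1) / 2;
            // both divisions round up
            freeListBlocks = (dataBlocks + bitsPerBlock - 1) / bitsPerBlock;
            inodeBlocks = (dataBlocks + inodesPerBlock - 1) / inodesPerBlock;
            blockTotal = superBlocks + freeListBlocks + inodeBlocks + dataBlocks;
            if (blockTotal < blocks) {
                lo = dataBlocks + 1;
            } else if (blockTotal > blocks) {
                hi = dataBlocks - 1;
            } else {
                break;
            }
        }

        // if the last guess overshot, give one block back
        if (blockTotal > blocks) {
            dataBlocks--;
        }

        // the inode region absorbs whatever is left over
        freeListBlocks = (dataBlocks + bitsPerBlock - 1) / bitsPerBlock;
        inodeBlocks = blocks - superBlocks - freeListBlocks - dataBlocks;
        return new int[] { superBlocks, freeListBlocks, inodeBlocks, dataBlocks };
    }

    public static void main(String[] args) {
        // 40 blocks of 256 bytes, assuming 64-byte inodes
        int[] r = layout(40, 256, 64);
        System.out.println("super_blocks: " + r[0]);
        System.out.println("free_list_blocks: " + r[1]);
        System.out.println("inode_blocks: " + r[2]);
        System.out.println("data_blocks: " + r[3]);
    }
}
```

Under those assumed parameters the search settles on 30 data blocks, 8 inode blocks, 1 free-list block and 1 superblock, which together account for all 40 blocks.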

filesys/ProcessContext.java

/**
 * A process context.  This contains all information needed
 * by the file system which is specific to a process.
 */
public class ProcessContext
{
  /**
   * Number of last error.
   * Simulates the unix system variable:
   *   extern int errno;
   */
  public int errno = 0 ;

  /**
   * The uid for the process.
   */
  private short uid = 1 ;

  /**
   * The gid for the process.
   */
  private short gid = 1 ;

  /**
   * The working directory for the process.
   */
  private String dir = "/root" ;

  /**
   * The umask for the process.
   */
  private short umask = 0000 ;

  /**
   * The maximum number of files a process may have open.
   */
  public static int MAX_OPEN_FILES = 0 ;

  /**
   * The array of file descriptors for open files.
   * The integer file descriptors for kernel method calls
   * are indexes into this array.
   */
  public FileDescriptor[] openFiles = new FileDescriptor[MAX_OPEN_FILES] ;

  /**
   * Construct a process context.  By default, uid=1, gid=1, dir="/root",
   * and umask=0000.
   */
  public ProcessContext()
  {
    super() ;
  }

  /**
   * Construct a process context and specify uid, gid, dir, and umask.
   */
  public ProcessContext( short newUid , short newGid , String newDir ,
    short newUmask )
  {
    super() ;
    uid = newUid ;
    gid = newGid ;
    dir = newDir ;
    umask = newUmask ;
  }

  /**
   * Set the process uid.
   */
  public void setUid( short newUid )
  {
    uid = newUid ;
  }

  /**
   * Get the process uid.
   */
  public short getUid()
  {
    return uid ;
  }

  /**
   * Set the process gid.
   */
  public void setGid( short newGid )
  {
    gid = newGid ;
  }

  /**
   * Get the process gid.
   */
  public short getGid()
  {
    return gid ;
  }

  /**
   * Set the process working directory.
   */
  public void setDir( String newDir )
  {
    dir = newDir ;
  }

  /**
   * Get the process working directory.
   */
  public String getDir()
  {
    return dir ;
  }

  /**
   * Set the process umask.
   */
  public void setUmask( short newUmask )
  {
    umask = newUmask ;
  }

  /**
   * Get the process umask.
   */
  public short getUmask()
  {
    return umask ;
  }

  // ??? toString()

}


filesys/sample.conf
!
! my personal filesys configuration file
!
filesystem.root.filename = rayo.dat
filesystem.root.mode = r
process.uid = 1000
process.gid = 1000
process.umask = 002
process.dir = /home/rayo
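
A configuration file like sample.conf is read by Kernel.initialize() through java.util.Properties, with the umask parsed as an octal number. The snippet below is a small sketch of that parsing step only; the class name ConfSketch and its helper methods are hypothetical, and the file content is embedded as a string so the example runs without touching the disk. Lines starting with "!" are treated as comments by Properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper showing how the sample configuration is parsed. */
public class ConfSketch {

    // The sample.conf content, embedded for a self-contained example.
    public static final String SAMPLE = String.join("\n",
        "!",
        "! my personal filesys configuration file",
        "!",
        "filesystem.root.filename = rayo.dat",
        "filesystem.root.mode = r",
        "process.uid = 1000",
        "process.gid = 1000",
        "process.umask = 002",
        "process.dir = /home/rayo");

    /** Parse property text; '!' lines are comments in the Properties format. */
    public static Properties load(String text) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    /** The umask is written in octal, so it is parsed with radix 8. */
    public static short umask(Properties properties) {
        return Short.parseShort(properties.getProperty("process.umask", "002"), 8);
    }

    /** The uid is a plain decimal number with a default of 1. */
    public static short uid(Properties properties) {
        return Short.parseShort(properties.getProperty("process.uid", "1"));
    }

    public static void main(String[] args) {
        Properties p = load(SAMPLE);
        System.out.println("root file: " + p.getProperty("filesystem.root.filename"));
        System.out.println("uid: " + uid(p));
        System.out.println("umask (decimal): " + umask(p));
        // keys missing from the file fall back to the default passed in
        System.out.println("max open files: " + p.getProperty("kernel.max_open_files", "20"));
    }
}
```

Because sample.conf does not set kernel.max_open_files or process.max_open_files, those lookups fall back to the defaults supplied at the call site.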

filesys/Stat.class
public synchronized class Stat {
public short st_dev;
public short st_ino;
public short st_mode;
public short st_nlink;
public short st_uid;
public short st_gid;
public short st_rdev;
public int st_size;
public int st_atime;
public int st_mtime;
public int st_ctime;
public void Stat();
public void setDev(short);
public short getDev();
public void setIno(short);
public short getIno();
public void setMode(short);
public short getMode();
public void setNlink(short);
public short getNlink();
public void setUid(short);
public short getUid();
public void setGid(short);
public short getGid();
public void setRdev(short);
public short getRdev();
public void setSize(int);
public int getSize();
public void setAtime(int);
public int getAtime();
public void setMtime(int);
public int getMtime();
public void setCtime(int);
public int getCtime();
public void copyIndexNode(IndexNode);
}
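
The ls program earlier in this listing decides whether a name is a regular file or a directory by masking the value of Stat.getMode() with Kernel.S_IFMT. The sketch below illustrates that masking idea only. The class ModeMaskSketch is hypothetical, and the octal constants are the conventional POSIX values, used here as an assumption because the simulator's own definitions of S_IFMT, S_IFDIR and S_IFREG are not shown in this section.

```java
/** Hypothetical illustration of file-type masking on a mode word. */
public class ModeMaskSketch {
    // Conventional POSIX values (assumed; the simulator's constants may differ).
    public static final int S_IFMT = 0170000;  // mask selecting the file-type bits
    public static final int S_IFDIR = 0040000; // directory
    public static final int S_IFREG = 0100000; // regular file

    /** Classify a mode word by its file-type bits, ignoring permission bits. */
    public static String fileType(int mode) {
        int type = mode & S_IFMT;
        if (type == S_IFREG) {
            return "regular file";
        } else if (type == S_IFDIR) {
            return "directory";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(fileType(S_IFDIR | 0755)); // type bits plus rwxr-xr-x
        System.out.println(fileType(S_IFREG | 0644)); // type bits plus rw-r--r--
    }
}
```

The permission bits occupy the low part of the mode word, so they do not affect the comparison once the mask has been applied.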

filesys/Stat.java

/**

* This simulates the unix struct “stat”.

public

class

Stat

{

public

short
st_dev
=

0

;

public

short
st_ino
=

0

;

public

short
st_mode
=

0

;

public

short
st_nlink
=

0

;

public

short
st_uid
=

0

;

public

short
st_gid
=

0

;

public

short
st_rdev
=

0

;

public

int
st_size
=

0

;

public

int
st_atime
=

0

;

public

int
st_mtime
=

0

;

public

int
st_ctime
=

0

;

public

void
setDev
(

short
newDev
)

{

st_dev
=
newDev
;

}

public

short
getDev
()

{

return
st_dev
;

}

public

void
setIno
(

short
newIno
)

{

st_ino
=
newIno
;

}

public

short
getIno
()

{

return
st_ino
;

}

public

void
setMode
(

short
newMode
)

{

st_mode
=
newMode
;

}

public

short
getMode
()

{

return
st_mode
;

}

public

void
setNlink
(

short
newNlink
)

{

st_nlink
=
newNlink
;

}

public

short
getNlink
()

{

return
st_nlink
;

}

public

void
setUid
(

short
newUid
)

{

st_uid
=
newUid
;

}

public

short
getUid
()

{

return
st_uid
;

}

public

void
setGid
(

short
newGid
)

{

st_gid
=
newGid
;

}

public

short
getGid
()

{

return
st_gid
;

}

public

void
setRdev
(

short
newRdev
)

{

st_rdev
=
newRdev
;

}

public

short
getRdev
()

{

return
st_rdev
;

}

public

void
setSize
(

int
newSize
)

{

st_size
=
newSize
;

}

public

int
getSize
()

{

return
st_size
;

}

public

void
setAtime
(

int
newAtime
)

{

st_atime
=
newAtime
;

}

public

int
getAtime
()

{

return
st_atime
;

}

public

void
setMtime
(

int
newMtime
)

{

st_mtime
=
newMtime
;

}

public

int
getMtime
()

{

return
st_mtime
;

}

public

void
setCtime
(

int
newCtime
)

{

st_ctime
=
newCtime
;

}

public

int
getCtime
()

{

return
st_ctime
;

}

public

void
copyIndexNode
(

IndexNode
indexNode
)

{

st_mode
=
indexNode
.
getMode
()

;

st_nlink
=
indexNode
.
getNlink
()

;

st_uid
=
indexNode
.
getUid
()

;

st_gid
=
indexNode
.
getGid
()

;

st_size
=
indexNode
.
getSize
()

;

st_atime
=
indexNode
.
getAtime
()

;

st_mtime
=
indexNode
.
getMtime
()

;

st_ctime
=
indexNode
.
getCtime
()

;

}

__MACOSX/filesys/._Stat.java

filesys/SuperBlock.class
public synchronized class SuperBlock {
private short blockSize;
private int blocks;
private int freeListBlockOffset;
private int inodeBlockOffset;
private int dataBlockOffset;
public void SuperBlock();
public void setBlockSize(short);
public short getBlockSize();
public void setBlocks(int);
public int getBlocks();
public void setFreeListBlockOffset(int);
public int getFreeListBlockOffset();
public void setInodeBlockOffset(int);
public int getInodeBlockOffset();
public void setDataBlockOffset(int);
public int getDataBlockOffset();
public void write(java.io.RandomAccessFile) throws java.io.IOException;
public void read(java.io.RandomAccessFile) throws java.io.IOException;
}

filesys/SuperBlock.java

import
java
.
io
.
RandomAccessFile

;

import
java
.
io
.
IOException

;

import
java
.
util
.
*
;

public

class

SuperBlock

{

/**

* Size of each block in the file system.
*/

private

short
blockSize
;

/**

* Total number of blocks in the file system.
*/

private

int
blocks
;

/**

* Offset in blocks of the free list block region from the beginning

* of the file system.
*/

private

int
freeListBlockOffset
;

/**

* Offset in blocks of the inode block region from the beginning

* of the file system.
*/

private

int
inodeBlockOffset
;

/**

* Offset in blocks of the data block region from the beginning

* of the file system.
*/

private

int
dataBlockOffset
;

/**

* Construct a SuperBlock.
*/

public

SuperBlock
()

{

super
();

}

public

void
setBlockSize
(

short
newBlockSize
)

{

blockSize
=
newBlockSize
;

}

public

short
getBlockSize
()

{

return
blockSize
;

}

public

void
setBlocks
(

int
newBlocks
)

{

blocks
=
newBlocks
;

}

public

int
getBlocks
()

{

return
blocks
;

}

/**

* Set the freeListBlockOffset (in blocks)

*
@param
newFreeListBlockOffset the new offset in blocks
*/

public

void
setFreeListBlockOffset
(

int
newFreeListBlockOffset
)

{

freeListBlockOffset
=
newFreeListBlockOffset
;

}

/**

* Get the free list block offset

*
@return
the free list block offset
*/

public

int
getFreeListBlockOffset
()

{

return
freeListBlockOffset
;

}

/**

* Set the inodeBlockOffset (in blocks)

*
@param
newInodeBlockOffset the new offset in blocks
*/

public

void
setInodeBlockOffset
(

int
newInodeBlockOffset
)

{

inodeBlockOffset
=
newInodeBlockOffset
;

}

/**

* Get the inode block offset (in blocks)

*
@return
inode block offset in blocks
*/

public

int
getInodeBlockOffset
()

{

return
inodeBlockOffset
;

}

/**

* Set the dataBlockOffset (in blocks)

*
@param
newDataBlockOffset the new offset in blocks
*/

public

void
setDataBlockOffset
(

int
newDataBlockOffset
)

{

dataBlockOffset
=
newDataBlockOffset
;

}

/**

* Get the dataBlockOffset (in blocks)

*
@return
the offset in blocks to the data block region
*/

public

int
getDataBlockOffset
()

{

return
dataBlockOffset
;

}

/**

* writes this SuperBlock at the current position of the specified file.
*/

public

void
write
(

RandomAccessFile
file
)

throws

IOException

{

file
.
writeShort
(
blockSize
)

;

file
.
writeInt
(
blocks
)

;

file
.
writeInt
(
freeListBlockOffset
)

;

file
.
writeInt
(
inodeBlockOffset
)

;

file
.
writeInt
(
dataBlockOffset
)

;

for (int i = 0; i < blockSize - 18; i++)
    file.write((byte) 0);
}

/**
 * reads this SuperBlock at the current position of the specified file.
 */
public void read(RandomAccessFile file) throws IOException {
    blockSize = file.readShort();
    blocks = file.readInt();
    freeListBlockOffset = file.readInt();
    inodeBlockOffset = file.readInt();
    dataBlockOffset = file.readInt();
    file.skipBytes(blockSize - 18);
}

}

__MACOSX/filesys/._SuperBlock.java

filesys/tee.class
public synchronized class tee {
public static final String PROGRAM_NAME = tee;
public static final int BUF_SIZE = 4096;
public static final short OUTPUT_MODE = 448;
public void tee();
public static void main(String[]) throws Exception;
}

filesys/tee.java

/**
 * Reads standard input and writes to standard output
 * and the named file.
 * A simple tee program for a simulated file system.
 *

* Usage:


 *   java tee output-file

 */

public

class
tee

{

/**

* The name of this program.

* This is the program name that is used

* when displaying error messages.
*/

public

static

final

String
PROGRAM_NAME
=

"tee"

;

/**

* The size of the buffer to be used for reading from the

* file. A buffer of this size is filled before writing

* to the output file.
*/

public

static

final

int
BUF_SIZE
=

4096

;

/**

* The file mode to use when creating the output file.
*/

public

static

final

short
OUTPUT_MODE
=

0700

;

/**

* Copies standard input to standard output and to a file.

*
@exception
java.lang.Exception if an exception is thrown

* by an underlying operation
*/

public

static

void
main
(

String
[]
argv
)

throws

Exception

{

// initialize the file system simulator kernel

Kernel
.
initialize
()

;

// print a helpful message if the number of arguments is not correct

if
(
argv
.
length
!=

1

)

{

System
.
err
.
println
(
PROGRAM_NAME
+

": usage: java "

+
PROGRAM_NAME
+

" output-file"

)

;

Kernel
.
exit
(

1

)

;

}

// give the command line argument a better name

String
name
=
argv
[
0
]

;

// create the output file

int
out_fd
=

Kernel
.
creat
(
name
,
OUTPUT_MODE
)

;

if (out_fd < 0) {
    Kernel.perror(PROGRAM_NAME);
    System.err.println(PROGRAM_NAME + ": unable to open output file \"" + name + "\"");
    Kernel.exit(2);
}

// create a buffer for reading from standard input
byte[] buffer = new byte[BUF_SIZE];

// while we can, read from standard input
int rd_count;
while (true) {
    // read a buffer full of data from standard input
    rd_count = System.in.read(buffer);

    // if we reach the end (-1), quit the loop
    if (rd_count <= 0)
        break;

    // write what we read to the output file; if error, exit
    int wr_count = Kernel.write(out_fd, buffer, rd_count);
    if (wr_count <= 0) {
        Kernel.perror(PROGRAM_NAME);
        System.err.println(PROGRAM_NAME + ": error during write to output file");
        Kernel.exit(3);
    }

    // write what we read to standard output
    System.out.write(buffer, 0, rd_count);
}

// close the output file
Kernel.close(out_fd);

// exit with success
Kernel.exit(0);
}

}

__MACOSX/filesys/._tee.java

__MACOSX/Data structures assignment/._filesys.zip

Data structures assignment/week1

CI583: Data Structures and Operating Systems

1/76

What are you studying, and why?

Data structures and algorithms, with a focus in the latter part of
the module on their application in operating systems.
Data structures and algorithms are programming! They provide the
fundamental problem-solving tools for the computer scientist and
software engineer.

2/76

What are you studying, and why?

Choosing the right representation of a problem (the data
structure) and strategy for attacking the problem (the algorithm)
is the bulk of the work of finding a solution.
On this module, you will learn to recognise and understand these
standard tools, enabling you to make effective use of them in
your own work.

3/76

What are you studying, and why?

It’s tempting to think of this as a "solved problem":
When I want to use a stack|hashtable|tree, I use the one from my
standard library. What kind of idiot writes their own?
This isn’t hard to rebut.

4/76

What are you studying, and why?

There are many open problems in this subject, as we’ll see, such
as finding new algorithms for parallel computing.
The programmer who really understands these concepts can make
best use of them, even if that means picking the right data
structure from the standard library and implementing the right
well-known algorithm.
In fact, whatever kind of programming you do, you need to invent
your own algorithms and reason about their performance.

5/76

What will you learn?

By the end of the module you should be able to:

Assess how the choice of data structures and algorithm design
methods impacts the performance of programs.
Choose the appropriate data structure and algorithm design
method for a specified application.
Write programs using object-oriented design principles.

6/76

What will you learn?

By the end of the module you should be able to:

Solve problems using data structures such as linked lists,
stacks, queues, hash tables, binary trees, heaps and binary
search trees.
Solve problems using algorithm design methods such as the
greedy method, divide and conquer, dynamic programming and
backtracking.

7/76

How will you learn it?

Contact time is organised as follows:

Weekly lectures, which provide the theoretical basis for the
various topics. These will be available as videos/slides on
studentcentral, and I will be online (on MS Teams) to discuss
the material during the timetabled slot.
Weekly lab sessions in which you will be able to experiment
with and implement some of the ideas covered in lectures. Once
the assignment has been given out, the labs can also be used as
a clinic for that. Labs will be face-to-face (i.e. on campus).

8/76

How will you learn it?

Equally important is the independent study we expect you to do.
This should come to no less than the standard minimum of 4 hours
a week, though you will need to put more time in at various
points, and how you organise it is up to you.
Hint: If you do not put in this independent study time, you will
probably struggle to keep up with the material.

9/76

How is it assessed?

100% on the programming assignment and reflective report, handed
out roughly one third of the way through the module.
This is a programming problem, in which you will implement some
of the data structures and algorithms described in the lectures.
You will also submit a reflective report on the complexity of the
algorithms involved and of your solution.
There is no exam on this module.

10/76

Module overview

We will look at differing approaches to representing data.
These include linked lists, arrays, stacks, queues, trees,
including binary trees, and search trees.
We will look at the pros and cons of each, and how to implement
them.

11/76

Module overview

We will look at a variety of approaches to searching, sorting and
selecting data.
In the process of doing this we will consider algorithmic
strategies such as the greedy method, divide-and-conquer, dynamic
programming, and backtracking.
We will also see how we can use chance to design elegant
algorithms.
We will look at ways of analysing the performance of algorithms
using simple mathematical methods.

12/76

Resources

Representative books and a web resource:

Goodrich and Tamassia, Data Structures and Algorithms in Java
(4th edition), John Wiley & Sons.
Cormen et al., Introduction to Algorithms (3rd edition), MIT
Press.
Part of an online course from Stanford University:
https://www.coursera.org/course/algo
For the mathematically-inclined completist:
http://www-cs-faculty.stanford.edu/~uno/taocp.html

13/76

Resources

It is always helpful to be able to visualise new data structures.
When you encounter a new one you should have a play with it:
https://www.cs.usfca.edu/~galles/visualization/Algorithms.html
Demo

14/76

Introduction

An algorithm is simply a finite list of precise instructions
designed to accomplish a particular task.
We will sometimes present implementations of a given algorithm
using Java, and sometimes using pseudo-code.

15/76

Pseudo-code

Pseudo-code gives the logic and control flow of a program.
It is not intended to be in any particular language, but
hopefully you could easily translate it into any that you know.

-- Find the largest natural number that divides both a and b
-- without leaving a remainder.
function gcd (a, b)
  while b != 0
    if a > b
      a <- a-b
    else
      b <- b-a
    endIf
  endWhile
  return a
end

16/76

Introduction

A data structure is an object that collects related data and
behaviour, such as a Java class.
An abstract data type (ADT) defines data and behaviour that is
common to a number of concrete data structures.

17/76

Abstract and concrete data structures

abstract class Shape {
  public abstract double area();
}
class Square extends Shape {
  double width;
  public double area() {
    return width*width;
  }
}
class Circle extends Shape {
  double radius;
  public double area() {
    return Math.PI * (radius * radius);
  }
}

18/76

Simple collections: array

Probably the simplest data structure that represents a collection
of values is the array.
An array is a collection with a fixed size (determined when the
array is created). In typed languages (such as Java) each element
in the array has the same type.

19/76

Simple collections: array

We access elements in the array using an index, a number that
identifies the position in the array.
We start counting at zero, so valid indices are between 0 and one
less than the length of the array.

20/76

Simple collections: array

From a low-level point of view, we can think of an array as a
convenient way to access a series of memory locations.
From a higher-level, we might think of the array as a series of
"letter boxes" or "pigeon holes".

21/76

Simple collections: array

Given an array, a, with 10 elements, we access the ith element as
a[i]. The first element is a[0] and if i>9, we get a runtime error.
Accessing an element has a fixed cost and is very efficient –
getting the value of the 10th element has the same cost as
getting the value of the 1st.

22/76

Simple collections: linked list

An equally simple data structure is the linked list.
A linked list is a collection with no fixed size. In typed
languages, all elements must have the same type.
When we create a new list, it is empty. Then we can cons (add,
insert) elements to the head of the list.
The head is the first (most recently consed) element of the list.
Everything after that is called the tail.

23/76

Searching

Suppose we have an array containing unsorted data and we
need to find a particular element, x.

23 7 5 31 9 18 4 32 6

Our only option is to examine each element in the array, y, and
check whether x=y.
As simple as it is, this is our algorithm, called sequential
search.
How many steps will this take for an array of length n?

24/76

Searching

To see how many steps it will take to search our array (of
length n) for an element, x, there are several cases we need to
consider.

23 7 5 31 9 18 4 32 6

The best case.
The worst case.
The average case.

25/76

Searching

In the best case scenario, x is the first element in the array.
This will take us one step for any value of n.
In the worst case scenario, x is the last element in the array, or
is not found. This will take n steps.
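
The best and worst cases above can be checked by counting comparisons. What follows is a minimal Java sketch; the class name SequentialSearch and the comparisons counter are illustrative assumptions, not part of the module's code.

```java
class SequentialSearch {
    /** Number of comparisons made by the most recent call to search. */
    static int comparisons;

    /** Returns the index of x in a, or -1 if x is not present. */
    static int search(int[] a, int x) {
        comparisons = 0;
        for (int i = 0; i < a.length; i++) {
            comparisons++;              // one step per element examined
            if (a[i] == x) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        // best case: x is the first element
        System.out.println(search(a, 23) + " after " + comparisons + " step(s)");
        // worst case: x is the last element
        System.out.println(search(a, 6) + " after " + comparisons + " step(s)");
        // worst case: x is not in the array
        System.out.println(search(a, 99) + " after " + comparisons + " step(s)");
    }
}
```

For the nine-element array from the slides, the first call stops after one comparison and the other two need all nine.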

26/76

Searching

The average case is harder to reason about.
It is sometimes important to consider, but we normally
categorise algorithms by the lower and upper bounds of their
performance.
Often the lower bound (best case) is not that revealing,
because we can’t rely on getting lucky!

27/76

Searching

What if we are able to guarantee that the array will be sorted
before we start the search?

4 5 6 7 9 18 23 31 32

Then we can come up with better algorithms to do the
searching.
In particular, as soon as we get to an element greater than the
one we’re looking for, we can give up.

28/76

Searching

-- return the position of x in the array, a, or -1
-- if x is not in a
function search(x, a)
  i <- 0
  while i < length(a)
    if a[i] = x
      return i
    elif a[i] > x
      return -1
    endif
    i <- i+1
  endwhile
  return -1
end

29/76

Searching

4 5 6 7 9 18 23 31 32

What are best and worst cases for the new algorithm?

30/76

Searching

4 5 6 7 9 18 23 31 32

Unchanged! However, the average case will be improved.

31/76

Searching

4 5 6 7 9 18 23 31 32

Let’s try again. What if we start in the middle of the array?
Then either we find x straight away, or the element we’re looking
at is greater than or less than x.

32/76

Searching

In either case, now we only need to consider half of the array.
At one step, we have halved the size of the problem.
We can then apply the same step repeatedly.
This is called binary search.

33/76

Binary search

Searching the list when x=5

4 5 6 7 9 18 23 31 32

Step 1 Pick the middle element (n/2), and call it y.
y is greater than x, so ignore everything to the right of y and
search again.

34/76

Binary search

Searching the list when x=5

4 5 6 7 9 18 23 31 32

Step 2 Pick the new middle element and call it y.
Again, y is greater than x, so ignore everything to the right of
y and search again.

35/76

Binary search

Searching the list when x=5

4 5 6 7 9 18 23 31 32

Step 3 Pick the new middle element and call it y.
This time, y = x and we are done.

36/76

Binary search

Binary search halves the size of the problem at each step.
It performs incredibly well: searching a list of one million
items won’t take more than twenty steps.

37/76

Binary search

Steps required to find an element in an ordered array of length n.

n               Steps
10              4
100             7
1000            10
10,000          14
100,000         17
1,000,000       20
10,000,000      24
100,000,000     27
1,000,000,000   30

You can check this by repeatedly halving n until it is too small
to divide further.

38/76

Binary search

-- Find the position of x in the array a
-- or -1 if x is not found
function bsearch (x, a)
  start <- 0
  end <- length(a) - 1
  while start <= end
    middle <- (start + end) / 2
    if a[middle] < x
      start <- middle + 1
    elif a[middle] = x
      return middle
    else -- must be a[middle] > x
      end <- middle - 1
    endif
  endwhile
  return -1
end

39/76
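
The bsearch pseudo-code translates almost line for line into Java. This is a sketch rather than the module's reference solution: the class name BinarySearchDemo is an assumption, and the midpoint is computed as start + (end - start) / 2 to avoid int overflow on very large arrays.

```java
class BinarySearchDemo {
    /** Returns the index of x in the sorted array a, or -1 if x is absent. */
    static int bsearch(int[] a, int x) {
        int start = 0;
        int end = a.length - 1;                      // last valid index
        while (start <= end) {
            int middle = start + (end - start) / 2;  // same value as (start + end) / 2
            if (a[middle] < x) {
                start = middle + 1;                  // discard the left half
            } else if (a[middle] == x) {
                return middle;
            } else {
                end = middle - 1;                    // discard the right half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {4, 5, 6, 7, 9, 18, 23, 31, 32};
        System.out.println(bsearch(sorted, 5));   // index of 5
        System.out.println(bsearch(sorted, 12));  // -1, since 12 is absent
    }
}
```

The array must already be sorted; on unsorted input the method may return -1 for a value that is present.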

Binary search

So why don’t we make all our arrays sorted? Consider the cost
of inserting an element:

4 5 6 7 9 18 23 31 32 (inserting 12)

This is now an expensive operation that may require relocating
many elements.
The same goes for deletion.
We will look into this sort of trade-off in detail during the
module.

40/76

The linked list

Just as simple as the array but more versatile, the linked list
has many uses and variations.
Each element in the list contains a value (the data item) and a
link to the next item in the list.
Head

. . .
Tail

41/76

The linked list

We call the first element in the list the head, everything else
the tail, and the last element links to nothing.
We call the operation of sticking a new element on the front of
the list cons.
Getting access to the head and consing are cheap operations
with a fixed cost.
Unlike the array, accessing the nth element takes n steps.
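
The cost difference between cons and indexed access can be sketched in Java as below. The class IntList and its get method are hypothetical names used only for this illustration.

```java
class IntList {
    /** One cell of the list: a value plus a link to the next cell. */
    private static class Node {
        final int data;
        final Node next;

        Node(int data, Node next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node head;   // null when the list is empty

    /** Adds data at the front of the list: a fixed number of steps. */
    void cons(int data) {
        head = new Node(data, head);
    }

    /** Returns element n (counting from 0) by following n links. */
    int get(int n) {
        Node current = head;
        for (int i = 0; i < n && current != null; i++) {
            current = current.next;
        }
        if (n < 0 || current == null) {
            throw new IndexOutOfBoundsException("index " + n);
        }
        return current.data;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        list.cons(3);
        list.cons(2);
        list.cons(1);                      // the list is now 1, 2, 3
        System.out.println(list.get(0));   // head: no links followed
        System.out.println(list.get(2));   // last element: two links followed
    }
}
```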

42/76

The linked list

One of the ways of using a linked list in Java is to use the
LinkedList class from java.util.
Or we could write our own. A class for nodes in the list:
class Node {

int data;
Node next;

public Node(int data) {
this.data = data;
next = null;

}
}

43/76

The linked list
A class for the list itself:
public class LinkedList {

Node head;

public LinkedList(int data) {
head = new Node(data);

}

public void cons (int data) {
// link the new node in front of the current head: a fixed-cost operation
Node n = new Node(data);
n.next = head;
head = n;

}
}

44/76

After the break

Next time we will introduce some simple mathematical
notation for describing the time and space costs of an
algorithm, called its complexity.
We will see that we can categorise all algorithms into classes
which have the same complexity.
We will use our new notation to discuss the complexity of some
of the operations we have been discussing so far.

45/76

Complexity

Recall the two algorithms for searching that we discussed last
time – sequential search and binary search.
These algorithms perform very differently, especially for large
inputs.

46/76

Complexity

In order to understand the algorithms (and thus the programs)
we create, we need to understand two things:

how much time they take to run for a given input, and
how much memory they will consume whilst they’re
running.

47/76

Complexity

We’re not much interested in the actual time an algorithm will
take because this will vary with the hardware used.
So, we measure the number of steps the algorithm will take for
a given size of input, and how this increases with the size of
the input.
We call the measure of the steps taken relative to the size of
the input the time complexity (or just complexity) of the
algorithm.
The measure of the memory consumed is called the space
complexity.

48/76

Mathematical background

Usually, the time complexity is the most important measure
and when we refer to the complexity of an algorithm without
specifying which type, it’s the time complexity we mean.
There are a few simple mathematical ideas we need in order to
describe complexity.

49/76

Floor and Ceiling

If n is a number then we say the floor of n, written ⌊𝑛⌋, is the
largest integer that is less than or equal to n.
Similarly, we say that the ceiling of n, written ⌈𝑛⌉ is the
smallest integer that is greater than or equal to n.

50/76

Floor and ceiling

For positive numbers, this is just rounding up and down.
So, ⌊2.5⌋ is 2 and ⌈2.5⌉ is 3.
We use this most often when talking about the complexity of
an algorithm that depends on dividing the input in some way.

51/76

Floor and ceiling

So, if we need to compare the elements of a list of length n
pairwise (compare elements 1 and 2, then elements 3 and 4,
etc.), the number of steps required is ⌊𝑛/2⌋.
If n=10 then we need ⌊10/2⌋ = 5 steps.
If n=11 then we need ⌊11/2⌋ steps, which is also equal to 5.
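
In Java, integer division of non-negative numbers already rounds down, and Math.floor, Math.ceil and Math.floorDiv make the rounding explicit. A small sketch follows; the class name FloorCeiling and the method pairwiseSteps are illustrative assumptions.

```java
class FloorCeiling {
    /** Number of pairwise comparisons for a list of length n: the floor of n/2. */
    static int pairwiseSteps(int n) {
        return Math.floorDiv(n, 2);
    }

    public static void main(String[] args) {
        System.out.println(Math.floor(2.5));    // 2.0
        System.out.println(Math.ceil(2.5));     // 3.0
        System.out.println(pairwiseSteps(10));  // 5
        System.out.println(pairwiseSteps(11));  // 5
    }
}
```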

52/76

Factorial

The factorial of a number, n, written n!, is the product of the
numbers between 1 and n.
So, 4! = 1 x 2 x 3 x 4 = 24
You can see that factorials will get very big very quickly.

53/76

Logarithms

Logarithms play a very important role in the analysis of
complexity.
You can think of them as the dual of raising a number to some
power.
The logarithm, base y, of a number x is the power of y that will
produce x.
Or,

$\log_y x = z \iff y^z = x$.

54/76

Logarithms

So, $\log_{10} 45$ is (roughly) 1.6532 because $10^{1.6532}$ is
(roughly) 45.
The base of a logarithm can be any number, but we will
normally use 2 or 10.
We use log as a shorthand for $\log_{10}$ and lg as a shorthand for
$\log_2$.

55/76

Logarithms

Logarithms are strictly increasing functions, so if x>y then
$\log x > \log y$. Other useful things to know:

$\log_b 1 = 0$ (because $b^0 = 1$).
$\log_b b = 1$ (because $b^1 = b$).
$\log_b (x \times y) = \log_b x + \log_b y$.
$\log_b x^y = y \times \log_b x$.
$\log_a x = (\log_b x)/(\log_b a)$.

We can use these identities to simplify equations, change the
base of logs, etc.
We will also use simple ideas from probability and summations
like $\sum_{n=1}^{10} n$.
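
The change-of-base identity is how a logarithm to an arbitrary base is usually computed in Java, since the standard library only provides Math.log (natural logarithm) and Math.log10. A minimal sketch, with the class name Logs and the helper log chosen for this example:

```java
class Logs {
    /** Logarithm of x to the given base, using the change-of-base identity. */
    static double log(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        System.out.println(Math.log10(45));   // roughly 1.6532
        System.out.println(log(2, 1024));     // roughly 10, since 2^10 = 1024
        // product rule: lg(8 * 4) = lg 8 + lg 4
        System.out.println(log(2, 8 * 4) + " vs " + (log(2, 8) + log(2, 4)));
    }
}
```

Floating-point results are approximate, so compare them with a small tolerance rather than with ==.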

56/76

Calculating complexity

Say we have an algorithm, A, that takes a list of numbers, l, of
length n. A works in two stages:

Do something once for each element of $l$ (e.g. double the
number), then
compare every element of $l$ to every other element in the
list.

We can see that the first stage will take n steps and the second
$n^2$ steps.

57/76

Calculating complexity

We can describe the complexity of A with a function, f:

$f(n) = kn + jn^2$,
where k is the constant cost of doubling a number and j is the
constant cost of comparing two numbers.

58/76

Calculating complexity

Disregarding the constants for a moment, when n=5, f(n)
works out as 5+25 steps.

Here, n and $n^2$ are "fairly similar" values.
When n=100, A takes 100+10,000 steps.

Now, n is starting to become much less significant than $n^2$.

59/76

Calculating complexity

As n increases further still, we can effectively forget about that
part of A that takes n steps as the part that takes $n^2$ will
dominate.
So, we say that the complexity of A is determined by the
largest term, $n^2$, and we forget about the smaller terms.
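
How quickly the quadratic term takes over can be seen by computing its share of the total. The sketch below disregards the constants k and j, as the slides do; the names DominantTerm, steps and quadraticShare are assumptions made for this example.

```java
class DominantTerm {
    /** Steps taken by algorithm A with both constants set to 1: n + n^2. */
    static long steps(long n) {
        return n + n * n;
    }

    /** Fraction of the total steps contributed by the n^2 stage. */
    static double quadraticShare(long n) {
        return (double) (n * n) / steps(n);
    }

    public static void main(String[] args) {
        for (long n : new long[] {5, 100, 10_000}) {
            System.out.println("n=" + n + ": " + steps(n) + " steps, n^2 share "
                    + quadraticShare(n));
        }
    }
}
```

At n=5 the quadratic stage is about 83% of the work; at n=100 it is already over 99%.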

60/76

Calculating complexity

A similar reasoning applies to the constants k and j.

In the $n^2$ stage of A, numbers are compared to each other, an
operation which has a fixed cost, j.

So the complexity is really determined by $jn^2$ but since j
never varies, we ignore it for the sake of clarity.

For any other algorithm with the same largest term, $n^2$, we say
it has the same order of complexity as A, even though the
details (e.g. constants and smaller terms) may differ widely.

61/76

Calculating complexity

A second algorithm, B, might have a more expensive operation
performed $n^2$ times, governed by a different constant j′, and
other smaller terms:

$g(n) = k'\left(\frac{3n}{2}\right) + j'n^2$.

Nevertheless, we say that A and B have the same order.
This might seem a very approximate measure, and it is, but it
tells us what will happen as the size of the input to A and B
grows.

62/76

Calculating complexity

As well as working out the order of algorithms, we can
categorise them as follows:

Those that grow at least as fast as some function f. This
category is called Big-Omega, written Ω(𝑓 ).
Those that grow at most as fast as some function f. This is
the most useful category, called Big-O, and written 𝑂(𝑓 ).
Those that grow at the same rate as some function f. This
category is called Big-Theta, written Θ(𝑓 ).

We are not usually very interested in Ω(𝑓 ). Θ(𝑓 ) is sometimes
of interest, but most of the time we are concerned with 𝑂(𝑓 ).

63/76

Calculating complexity

As an easy example, consider the Binary Search algorithm
from the previous lecture.
At each step in the algorithm, the size of the problem is halved,
until we are done.
Let n be the length of the input to the Binary Search algorithm.
Furthermore, let $n = 2^k - 1$ for some k.
After the first pass of the loop, there are $2^{k-1} - 1$ elements in
the first half of the list, 1 in the middle, and $2^{k-1} - 1$ in the
second half.
After each pass of the loop, the power of 2 decreases by 1.

64/76

Calculating complexity

In the worst case, we continue until n=1, which is also when k
is 1, since $2^1 - 1 = 1$.
This means there are at most k passes when $n = 2^k - 1$.
Solving this equation gives us

$k = \lg(n + 1)$.

So Binary Search has a worst case complexity of $O(\lg(n + 1))$,
or just $O(\lg n)$.
A logarithm of one base can be converted to another by
multiplying by a constant factor, so we just say $O(\log n)$.
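
The bound $k = \lg(n + 1)$ can be checked by repeatedly halving n, as suggested earlier for the table of step counts. A minimal Java sketch; the class and method names are illustrative.

```java
class BinarySearchSteps {
    /** Counts how many times n can be halved (integer division) before reaching 0. */
    static int worstCaseSteps(long n) {
        int steps = 0;
        while (n > 0) {
            n /= 2;      // each pass of binary search halves the remaining range
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long[] sizes = {7, 10, 100, 1000, 1_000_000, 1_000_000_000};
        for (long n : sizes) {
            System.out.println(n + " -> " + worstCaseSteps(n));
        }
    }
}
```

For $n = 2^k - 1$ the count is exactly k (for example, n=7 gives 3), and for other values it is $\lg(n + 1)$ rounded up.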

65/76

Growth rates of algorithms
The complexity of most algorithms we come across is
governed by some commonly occurring functions (this graph is
an approximation):

[Figure: number of steps against number of items (n) for O(1),
O(log n), O(n), $O(n^2)$, $O(2^n)$ and O(n!).]

66/76

Growth rates: Constant time. O(1)

Constant time means that no matter how large the input is,
the time taken doesn’t change.
Some examples of O(1) operations:

Determining if a number is even or odd.
Accessing an element in an array.
Using a constant-size lookup table or hash table.

67/76

Growth rates: Logarithmic time, O(log n)

An algorithm which cuts the problem in half each time is
logarithmic.
(Other patterns of computation end up being logarithmic, but
this is a simple rule of thumb for spotting one.)
An O(log n) operation will take longer as n increases, but once
n gets fairly large the number of steps required will increase
quite slowly.

68/76

Growth rates: Linear time, O(n)

Linear time means that for every element, a constant number
of operations is carried out, such as comparing each element
to a known value.
The larger the input, the longer a O(n) operation takes.
Every time you double n, the operation will take twice as long.
An example of a linear time operation is finding an item in an
unsorted list using sequential search.

69/76

Growth rates: Loglinear time, O(n log n)

O(n log n) means that an O(log n) operation is
carried out for each item in the input.
Several of the most efficient sort algorithms are in this order.
Examples of loglinear operations are quicksort (in the average
and best case), heapsort and merge sort.

70/76

Growth rates: Quadratic time: $O(n^2)$

Quadratic time often means that for every element, you do
something with every other element, such as comparing them.
The time taken for a quadratic operation increases drastically
with the input size.

Examples of $O(n^2)$ operations are quicksort (in the worst
case) and bubble sort.

71/76

Growth rates: Exponential time: $O(2^n)$

Exponential time means that the number of steps required will
double with each additional element in the input data set.
This is a figure that gets very big very fast!
Exponential operations are normally impractical for any
reasonably large n, although of course many problems may
require an exponential algorithm.
An example of an $O(2^n)$ operation is the famous travelling
salesman problem with a solution that uses dynamic
programming.
We will study this problem in later weeks.

72/76

Growth rates: Factorial time: O(n!)

A factorial time solution involves doing something for all
possible permutations of the n elements.
An operation with this complexity is impractical for all but
small values of n.
An example of an O(n!) operation is the travelling salesman
problem using brute force, where every combination of paths
will be examined.

73/76

Growth rates of algorithms

Another way of visualising the growth rates of the frequently
encountered orders:

O(f(n))      n=10        n=100
O(1)         1           1
O(log n)     3           7
O(n)         10          100
O(n log n)   30          700
$O(n^2)$     100         10,000
$O(2^n)$     1024        $2^{100}$
O(n!)        3,628,800   100!
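
The last two rows of the table overflow Java's long at n=100, so a sketch that reproduces them needs java.math.BigInteger. The class and method names below are assumptions for illustration.

```java
import java.math.BigInteger;

class GrowthRates {
    /** n! computed exactly with arbitrary-precision integers. */
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /** 2^n computed exactly. */
    static BigInteger powerOfTwo(int n) {
        return BigInteger.ONE.shiftLeft(n);
    }

    public static void main(String[] args) {
        System.out.println("2^10  = " + powerOfTwo(10));
        System.out.println("10!   = " + factorial(10));
        System.out.println("2^100 has " + powerOfTwo(100).toString().length() + " digits");
        System.out.println("100!  has " + factorial(100).toString().length() + " digits");
    }
}
```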

74/76

Growth rates of algorithms

75/76

Next week’s lecture

Next week: Simple sorting methods (bubble sort, selection
sort and insertion sort), their implementation and
performance.

76/76

__MACOSX/Data structures assignment/._week1

Data structures assignment/week6c

CI583: Data Structures and Operating Systems

Algorithms for parallel computing

1 / 17

Outline

1 Parallel computing

2 MapReduce

3 A model for parallelism

4 Review

2 / 17

Moore’s Law slows down

For several decades, computer manufacturers were able to increase
the processing power and memory capacity of individual CPUs at
an exponential rate.

In 1965 the original expression of Moore’s Law stated that the
number of components in a CPU would double every year and it
proved to be accurate. By the 2000s this had slowed down to a
doubling in power every 18 months to two years, and the trend is
likely to continue.

Manufacturers increasingly focus their efforts on increasing the
number of CPUs available to a single machine.

3 / 17

Parallel computing and the cloud

At the same time, advances in parallel computing make it easier to
distribute a problem amongst cores and amongst remote machines.

One of the big contributors to this is the advent of cloud
computing and services such as Amazon S3 which provide flexible
storage and processing power distributed across large clusters of
commodity machines (as opposed to very expensive
supercomputers, available only to tiny numbers of researchers).

4 / 17

Popular parallel programming

As an example of how simple parallel computing is becoming, open
frameworks such as Hadoop [1] allow the easy creation of large
clusters, something which was a research topic in itself quite
recently.

The Akka [2] framework (for Java and Scala) allows an algorithm to
be distributed over a large number of nodes. Their "Hello World"
example is a distributed algorithm for computing π to arbitrary
decimal places, and does so in very few lines of code.

[1] http://hadoop.apache.org/
[2] http://doc.akka.io/docs/akka/2.0.1/intro/getting-started-first-java.html

5 / 17

Big Data

At the same time the way we use computers, and the internet
especially, means that massive amounts of data are generated every
hour, minute, second…

If you had to design an algorithm to mine Twitter for upcoming
trends, how would you do it? Would you design it to run on a
single machine?

Big Data is the term coined for this environment: sequential
algorithms (and other traditional technologies such as relational
databases) break down in the face of very large datasets but
parallelism can help to make many of the problems easier.


MapReduce

A prominent example of this new paradigm is Google MapReduce.
The original paper [3] is well worth reading.

MapReduce is a design pattern rather than an algorithm. It
depends on breaking down a problem (such as processing a large
log file to count the requests from each country, for example)
into many small parts with no shared state so that each part can be
worked on separately by “worker nodes”. The results are later
combined by a “master node”.

[3] http://research.google.com/archive/mapreduce.html
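The log-counting example can be sketched on a single machine with Java streams. This is only an illustration of the map and reduce steps, not Google's implementation: the class name, the method names and the log format (country code as the first space-separated field) are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CountryCountSketch {
    // "Map" step: one worker counts requests per country in its own chunk.
    // Assumed (hypothetical) log format: the country code is the first field.
    static Map<String, Long> mapChunk(List<String> chunk) {
        return chunk.stream()
                .map(line -> line.split(" ")[0])
                .collect(Collectors.groupingBy(c -> c, TreeMap::new, Collectors.counting()));
    }

    // "Reduce" step: the master merges the partial counts from all workers.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((country, n) -> total.merge(country, n, Long::sum));
        }
        return total;
    }

    // Chunks share no state, so they can be processed in parallel.
    static Map<String, Long> countByCountry(List<List<String>> chunks) {
        List<Map<String, Long>> partials = chunks.parallelStream()
                .map(CountryCountSketch::mapChunk)
                .collect(Collectors.toList());
        return reduce(partials);
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
                Arrays.asList("UK /index.html", "FR /about.html"),
                Arrays.asList("UK /contact.html"));
        System.out.println(countByCountry(chunks));
    }
}
```

Because each chunk is counted independently, the same two functions could in principle be handed to separate worker nodes; only the merge needs to see all the partial results.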

MapReduce

[Figure: MapReduce diagram. Image © http://en.wikipedia.org/wiki/MapReduce]

A model for parallelism

The main subject of this lecture is not frameworks such as
MapReduce or Akka, but the principles of designing algorithms that
can run inside them.

We need to develop a model for analysing parallel algorithms. It
will include the notions of speed up, cost and scalability.


The speed up is the improvement a parallel algorithm provides over
an optimal sequential version.

So, we know that sequential MergeSort runs in O(n log n) time. If
we can provide a parallel version that runs in O(n) time then the
speed up is O(log n). Note that when all of the processors take a
step at the same time we count it as one step: we are measuring
time, not work.

The cost of a parallel algorithm is the measure of work: the time it
takes multiplied by the processors required. If our O(n) parallel
MergeSort requires at least n processors, then its cost is O(n^2).
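To make the two measures concrete, here is a small sketch that plugs numbers into the MergeSort example. It ignores constant factors and treats the step counts as exactly n log2 n and n, which is an assumption made purely for illustration; the class and method names are hypothetical.

```java
class ParallelCostSketch {
    // Integer log base 2, exact for powers of two.
    static long log2(long n) {
        return 63 - Long.numberOfLeadingZeros(n);
    }

    // Idealised step counts, constants ignored.
    static long sequentialSteps(long n) {
        return n * log2(n);
    }

    static long parallelSteps(long n) {
        return n;
    }

    // Speed up = sequential time / parallel time.
    static long speedUp(long n) {
        return sequentialSteps(n) / parallelSteps(n);
    }

    // Cost = parallel time * number of processors.
    static long cost(long n, long processors) {
        return parallelSteps(n) * processors;
    }

    public static void main(String[] args) {
        long n = 1024;
        System.out.println("sequential steps: " + sequentialSteps(n)); // 10240
        System.out.println("parallel steps:   " + parallelSteps(n));   // 1024
        System.out.println("speed up:         " + speedUp(n));         // 10
        System.out.println("cost:             " + cost(n, n));         // 1048576
    }
}
```

For n = 1024 the parallel version is ten times faster, but its cost is about a hundred times the work done by the sequential algorithm.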


The scalability of a parallel algorithm is related to the number of
processors required.

If our parallel MergeSort requires n processors, it is not usable for
large values of n. The sequential algorithm has no such size
restriction.

We are only interested in scalable parallel algorithms: those for
which the number of processors required is significantly less than
the size of any potential input and where we can increase the input
size without needing to add processors.


Summary

Parallel algorithms require careful design, especially in the presence
of shared state.

One way around this is to design our algorithms in a functional
style, so that they don’t require shared state. An algorithm like this
should be easy to deploy in a framework like Akka. In fact, libraries
like Akka intend to make many of the error-prone parts of parallel
programming happen “under the bonnet”.

Either way, making use of the multiple cores available on a typical
modern machine and of environments like the cloud are increasingly
important techniques. Parallel programming is for all of us
nowadays!


Advice from a Grand Master Programmer

Rob Pike (co-creator of the Go programming language and UTF-8,
among many other things) has this to say about program complexity [4]:

Pike’s Rules of Complexity

Most programs are too complicated – that is, more complex than
they need to be to solve their problems efficiently. Why? Mostly
it’s because of bad design […]. But programs are often complicated
at the microscopic level, and that is something I can address here.

[4] Some of Rob Pike’s thoughts on programming:
http://doc.cat-v.org/bell_labs/pikestyle


Rule 1. You can’t tell where a program is going to spend its
time. Bottlenecks occur in surprising places, so don’t try to
second guess and put in a speed hack until you’ve proven
that’s where the bottleneck is.

Rule 2. Measure. Don’t tune for speed until you’ve measured,
and even then don’t unless one part of the code overwhelms
the rest.


Rule 3. Fancy algorithms are slow when n is small, and n is
usually small. Fancy algorithms have big constants. Until you
know that n is frequently going to be big, don’t get fancy.
(Even if n does get big, use Rule 2 first.) For example, binary
trees are always faster than splay trees for workaday problems.


Rule 4. Fancy algorithms are buggier than simple ones, and
they’re much harder to implement. Use simple algorithms as
well as simple data structures.
The following data structures are a complete list for almost all
practical programs:

• array
• linked list
• hash table
• binary tree

Of course, you must also be prepared to collect these into
compound data structures. For instance, a symbol table might
be implemented as a hash table containing linked lists of arrays
of characters.
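As a rough illustration of such a compound structure, the sketch below builds a tiny symbol table as a hash table whose buckets are linked lists of character arrays. The class name, the fixed bucket count and the use of Arrays.hashCode as the hash function are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class SymbolTableSketch {
    // Hash table: a fixed list of buckets, each a linked list of char arrays.
    private final List<LinkedList<char[]>> buckets = new ArrayList<>();

    SymbolTableSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Choose the bucket from the hash of the characters.
    private LinkedList<char[]> bucketFor(char[] name) {
        return buckets.get(Math.floorMod(Arrays.hashCode(name), buckets.size()));
    }

    // Returns true if the symbol is already stored.
    boolean contains(String name) {
        char[] chars = name.toCharArray();
        for (char[] entry : bucketFor(chars)) {
            if (Arrays.equals(entry, chars)) {
                return true;
            }
        }
        return false;
    }

    // Adds the symbol; returns false if it was already present.
    boolean add(String name) {
        if (contains(name)) {
            return false;
        }
        char[] chars = name.toCharArray();
        bucketFor(chars).add(chars);
        return true;
    }

    public static void main(String[] args) {
        SymbolTableSketch table = new SymbolTableSketch(16);
        table.add("count");
        System.out.println(table.contains("count")); // true
        System.out.println(table.contains("total")); // false
    }
}
```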


Rule 5. Data dominates. If you’ve chosen the right data
structures and organized things well, the algorithms will almost
always be self-evident. Data structures, not algorithms, are
central to programming. (See The Mythical Man-Month:
Essays on Software Engineering by F. P. Brooks, page 102.)

Rule 6. There is no Rule 6.


Data structures assignment/week6b

CI583: Data Structures and Operating Systems

More probabilistic and non-deterministic algorithms

Probabilistic strategies

As well as algorithms that return an approximate answer, there are
other significant categories of probabilistic algorithm.

These can often improve on the running time of a deterministic
algorithm, or can be used to address problems for which there is no
known efficient, deterministic solution:

1 Monte Carlo algorithms,
2 Las Vegas algorithms, and
3 Sherwood algorithms.

Monte Carlo algorithms

A Monte Carlo algorithm always gives a result but it might be
wrong, like our primality test.

The probability that the result is correct increases the longer the
algorithm is allowed to run.

We are only interested in those algorithms that return a correct
answer more than half the time.


If there are several choices that can be made during execution, an
algorithm that will always make the same choices is called
consistent.

As well as allowing the algorithm to run for a longer time, we can
improve a consistent algorithm by calling it multiple times and
choosing the answer that appears most often.


Suppose we want to discover whether an array contains a “majority
element”, one which appears in more than half of the locations.

The most obvious solution, comparing every element to every other
element, takes O(n^2) steps.

A Monte Carlo solution would pick an element at random and
check whether it appears in more than half of the array.

If the algorithm returns true, it is definitely correct. If it returns
false, it might just have picked the wrong element.


However, if there is a majority element, the chance of picking it is
more than 50% and increases the more often the element appears.

If we call Majority 5 times, the likelihood of it being right
increases to about 97% (1 − (1/2)^5 ≈ 0.97) at a cost of 5n steps,
making it O(n).

procedure Majority(array, n)
    choice ← Random(0, n − 1)    ▷ random index between 0 and n − 1
    count ← 0
    for i from 0 to n − 1 do
        if array[i] = array[choice] then
            count ← count + 1
        end if
    end for
    return count > n/2
end procedure
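A minimal Java sketch of the procedure above, together with the repeated-call wrapper described in the text. The method names and the choice to pass in a java.util.Random are assumptions for the example; the array is assumed to be non-empty.

```java
import java.util.Random;

class MajoritySketch {
    // One Monte Carlo trial: pick a random element and count its occurrences.
    // A result of true is always correct; false may be a wrong answer.
    static boolean majorityOnce(int[] array, Random rng) {
        int choice = rng.nextInt(array.length); // random index in 0..n-1
        int count = 0;
        for (int value : array) {
            if (value == array[choice]) {
                count++;
            }
        }
        return count > array.length / 2;
    }

    // Repeat the trial; each extra trial at least halves the chance of a false "no".
    static boolean majority(int[] array, int trials, Random rng) {
        for (int t = 0; t < trials; t++) {
            if (majorityOnce(array, rng)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] data = {4, 4, 1, 4, 2, 4, 4};
        System.out.println(majority(data, 5, new Random()));
    }
}
```

Since a true result is never wrong, the wrapper can stop at the first success instead of taking a vote over all the trials.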


Las Vegas algorithms

A Las Vegas algorithm will never return the wrong answer but it
might return no answer at all.

A program that uses a Las Vegas algorithm would call it repeatedly
until it succeeds or gives up.

Recall the N-Queens problem, for which we presented a
backtracking solution in lecture 4a.

We need to place n queens on an n × n chessboard so that none of
the queens is attacked.

A Las Vegas solution would place queens randomly, row by row,
and give up if there is nowhere to put a queen on the next row.


Probabilistic N-Queens

For each row, inspect each column location.

If it is not attacked, add it to a list of possible places for the next
queen. If this list remains empty, give up.

If there are m possible places for this queen in the current row,
place it in one of them so that each place has a 1/m chance of
being picked.

If we get all the way to the last row, return the solution.


procedure LVQueens(result)    ▷ an empty array that will hold column positions
    for row from 0 to 7 do
        places ← 0
        for col from 0 to 7 do
            if board(row, col) is not attacked then
                places ← places + 1
                if Random(1, places) = 1 then
                    try ← col
                end if
            end if
        end for
        if places > 0 then
            result[row] ← try
        else
            return FAIL
        end if
    end for
    return result
end procedure
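The pseudocode can be sketched in Java as follows. The attack test, the generalisation from 8 to n queens and the maxAttempts limit in the calling loop are assumptions added for the example; a failed attempt is represented by null rather than FAIL.

```java
import java.util.Random;

class LasVegasQueensSketch {
    // A square is attacked if an earlier row has a queen in the same
    // column or on the same diagonal.
    static boolean attacked(int[] result, int row, int col) {
        for (int r = 0; r < row; r++) {
            int c = result[r];
            if (c == col || Math.abs(c - col) == row - r) {
                return true;
            }
        }
        return false;
    }

    // One Las Vegas attempt: returns column positions, or null on failure.
    static int[] attempt(int n, Random rng) {
        int[] result = new int[n];
        for (int row = 0; row < n; row++) {
            int places = 0;
            int chosen = -1;
            for (int col = 0; col < n; col++) {
                if (!attacked(result, row, col)) {
                    places++;
                    // Keep this column with probability 1/places, so every
                    // safe column ends up equally likely to be chosen.
                    if (rng.nextInt(places) == 0) {
                        chosen = col;
                    }
                }
            }
            if (places == 0) {
                return null; // nowhere to put a queen on this row: give up
            }
            result[row] = chosen;
        }
        return result;
    }

    // Call the algorithm repeatedly until it succeeds or we give up.
    static int[] solve(int n, int maxAttempts, Random rng) {
        for (int i = 0; i < maxAttempts; i++) {
            int[] result = attempt(n, rng);
            if (result != null) {
                return result;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] solution = solve(8, 1000, new Random());
        System.out.println(java.util.Arrays.toString(solution));
    }
}
```

The inner test Random(1, places) = 1 is what gives each of the m safe squares a 1/m chance without first building a list of them.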


A full statistical analysis will show that the chance of success is just
under 13% for 8 queens and that, on average, a failure will be
detected in fewer than 7 passes.

This means that we could usually find a solution in about 55 passes.

An optimised backtracking solution to 8-Queens is better than this,
but a recursive solution, for instance, uses about twice as many
passes.

See http://penguin.ewu.edu/~trolfe/QueenLasVegas/QueenLasVegas.html
for an analysis of this algorithm and a faster Las Vegas solution.

Sherwood algorithms

A Sherwood algorithm always returns the correct answer. So far,
sounds deterministic!

Sherwood algorithms introduce a coin toss to deterministic
algorithms in which the best, average and worst complexity differ
by a large amount depending on the input.

Consider BinarySearch, in which we usually start by picking an
element in the middle of the list.

In the average and best cases, this is the most sensible place to
start.


A Sherwood BinarySearch would pick the starting position at
random.

Sometimes this would pay off, meaning that we discard more than
half of the elements straight away, and sometimes it wouldn’t.

The name comes from Robin Hood (of Sherwood Forest), who
famously robbed the rich to pay the poor: we reduce the chance of
the occurrence of the worst case, at the cost of making the best
case less likely.
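A Sherwood-style search might look like the sketch below, which differs from the usual BinarySearch only in how the probe position is chosen. Picking a random probe at every step (rather than only at the start) is an assumption of this example, as are the names.

```java
import java.util.Random;

class SherwoodSearchSketch {
    // Searches a sorted array; returns the index of target, or -1.
    // The answer is always correct; only the running time is random.
    static int search(int[] sorted, int target, Random rng) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            // Probe a random position in [lo, hi] instead of the midpoint.
            int probe = lo + rng.nextInt(hi - lo + 1);
            if (sorted[probe] == target) {
                return probe;
            } else if (sorted[probe] < target) {
                lo = probe + 1;
            } else {
                hi = probe - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5, 7, 9, 11};
        System.out.println(search(data, 7, new Random())); // 3
        System.out.println(search(data, 4, new Random())); // -1
    }
}
```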


Probabilistic strategies

We have described algorithms that return an approximate answer,
such as “maybeprime”, and which improve the longer they run.

These are Monte Carlo algorithms: as long as the approximate
answer, the maybe, is right more than half the time, we can
improve the likelihood of a correct answer.

Las Vegas algorithms might not give an answer at all, but will not
return the wrong one.

Sherwood techniques can be applied to any deterministic algorithm
without affecting its correctness but can reduce the chance of
worst-case behaviour.


Next time

Algorithms for parallel computing and Big Data.


Data structures assignment/CI283 FileSystem Simulation

CI283

File System Simulator

The file system simulator, written in Java, simulates a UNIX file system. The
simulator reads or creates a file which represents the disk image, and keeps track of
allocated and free blocks using a bitmap.

The simulator classes are in filesys.zip, which you should unzip first before running
the file system simulator classes.

This File System Simulator is a collection of Java classes, which simulate the file
system calls available in a typical Unix-like operating system. The “Kernel” class
contains methods (functions) like “creat()”, “open()”, “read()”, “write()”, “close()”, etc.,
which read and write blocks in an underlying file in much the same way that a real
file system would read and write blocks on an underlying disk device.

In addition to the “Kernel” class, there are a number of underlying classes to support
the implementation of the kernel. The classes FileSystem, IndexNode,
DirectoryEntry, SuperBlock, Block, BitBlock, FileDescriptor, and Stat contain all data
structures and algorithms which implement the simulated file system.

Also included are a number of sample programs which can be used to operate on a
simulated file system. The Java programs “ls”, “cat”, “mkdir”, “mkfs”, etc., perform
file system operations to list directories, display files, create directories, and create
(initialize) file systems. These programs illustrate the various file system calls and
allow the user to carry out various read and write operations on the simulated file
system.

There is a backing file for the simulated file system. A “dump” program is included
with the distribution so that you can examine this file, byte-by-byte. Any dump
program may be used (e.g., the “od” program in Unix).

There are a number of ways you can use the simulator to get a better understanding
of file systems. You can

• use the provided utility programs (mkfs, mkdir, ls, cat, etc.) to perform
operations on the simulated file system and use the dump program to
examine the underlying file and observe any changes,

• examine the sample utility programs to see how they use the system call
interface to perform file operations,

• enhance the sample utility programs to provide additional functionality,

• write your own utility programs to extend the functionality of the simulated file
system, and

• modify the underlying Kernel and other implementation classes to extend the
functionality of the simulated file system.

In the sections which follow, you will learn what you need to know to perform each of
these activities.

Using File System Simulator Programs

Using mkfs

The mkfs program creates a file system backing file. It does this by creating a file
whose size is specified by the block size and number of blocks given. It writes the
superblock, the free list blocks, the inode blocks, and the data blocks for a new file
system. Note that it will overwrite any existing file of the name specified, so be
careful when you use this program.

This program is similar to the “mkfs” program found in Unix-like operating systems.

The general format for the mkfs command is

java mkfs file-name block-size blocks

where

file-name
is the name of the backing file to create (e.g., filesys.dat). Note that this is the
name of a real file, not a file in the simulator. This is the file that the simulator
uses to simulate the disk device for the simulated file system. This may be
any valid file name in your operating system environment.

block-size
is the block size to be used for the file system (e.g., 256). This should be a
multiple of the index node (i-node) size (usually 64) and the directory entry
size (usually 16). Modern operating systems usually use a size of 512 or
1024 bytes. We use 128 or 256 byte block sizes in many of our examples so
that you can quickly see what happens when directories grow beyond one
block. This should be a decimal number not less than 64, but less than 32768.

blocks
is the number of blocks to create in the file system (e.g., 40). This number
includes any blocks that may be used for the superblock, free list
management, inodes, and data blocks. We use a relatively small number here
so that you can quickly see what happens if you run out of disk space. This
can be any decimal number greater than 3, but not greater than 2^24 − 1 (the
maximum number of blocks), although you may not have sufficient space to
create a very large file.

For example, the command

java mkfs filesys.dat 256 40

will create (or overwrite) a file “filesys.dat” so that it contains 40 256-byte blocks for a
total of 10240 bytes.

The output from the command should look something like this:

block_size: 256
blocks: 40
super_blocks: 1
free_list_blocks: 1
inode_blocks: 8
data_blocks: 30
block_total: 40

From the output you can see that one block is needed for the superblock, one for
free list management, eight for index nodes, and the remaining 30 are available for
data blocks.

Why is there 1 block for free list management? Note that 30 blocks require 30 bits in
the free list bitmap. Since 256 bytes/block * 8 bits/byte = 2048 bits/block, clearly one
bitmap block is sufficient to track block allocation for this file system.

Why are there 8 blocks for index nodes? Note that 30 blocks could result in 30
inodes if many one-block files or directories are created. Since each inode requires
64 bytes, only 4 will fit in a block. Therefore, 8 blocks are set aside for up to 32
inodes.
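The arithmetic in the two answers above can be checked with a few lines of Java. This sketch only reproduces the reasoning in the text for the 256-byte, 40-block example; it is not the mkfs source, and how mkfs itself decides on the split is not described here. The class and method names are hypothetical.

```java
class MkfsLayoutSketch {
    static final int INODE_SIZE = 64; // bytes per index node, as stated in the text

    // Number of bitmap bits that fit in one block.
    static int bitsPerBlock(int blockSize) {
        return blockSize * 8;
    }

    // Number of index nodes that fit in one block.
    static int inodesPerBlock(int blockSize) {
        return blockSize / INODE_SIZE;
    }

    // Blocks needed to hold the given number of items (rounding up).
    static int blocksNeeded(int items, int itemsPerBlock) {
        return (items + itemsPerBlock - 1) / itemsPerBlock;
    }

    public static void main(String[] args) {
        int blockSize = 256;
        int dataBlocks = 30;
        int freeListBlocks = blocksNeeded(dataBlocks, bitsPerBlock(blockSize)); // 1
        int inodeBlocks = blocksNeeded(dataBlocks, inodesPerBlock(blockSize));  // 8
        int total = 1 + freeListBlocks + inodeBlocks + dataBlocks;              // 40
        System.out.println("free_list_blocks: " + freeListBlocks);
        System.out.println("inode_blocks: " + inodeBlocks);
        System.out.println("block_total: " + total);
    }
}
```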

Using mkdir

The mkdir program can be used to create new directories in our simulated file
system. It does this by creating the file specified as a directory file, and then writing
the directory entries for “.” and “..” to the newly created file. Note that all directories
leading to the new directory must already exist.

This program is similar to the “mkdir” command in Unix-like and MS-DOS-related
operating systems.

The general format for the mkdir command is

java mkdir directory-path

where

directory-path
is the path of the directory to be created (e.g., “/root”, or “temp”, or
“../home/rayo/moss/filesys”). If directory-path does not begin with a “/”, then it
is appended to the path name for the working directory for the default process.

For example, the command

java mkdir /home

creates a directory called “home” as a subdirectory of the root directory of the file
system.

Similarly, the command

java mkdir /home/user51

creates a directory called “user51” as a subdirectory of the “home” directory, which is
presumed to already exist as a subdirectory of the root directory of the file system.

Using ls

The ls program is used to list information about files and directories in the simulated
file system. For each file or directory name given it displays information about the
files named, or in the case of directories, for each file in the directories named.

This program is similar to the “ls” command in Unix-like operating systems, or the
“dir” command in DOS-related operating systems.

The general format for the ls command is

java ls path-name …

where

path-name …
is a space-separated list of one or more file or directory path names.

For example, the command

java ls /home

lists the contents of the “/home” directory. For each file in the directory, a line is
printed showing the name of the file or subdirectory, and other pertinent information
such as size.

The output from the command should look something like this:

/home:
1 48 .
0 48 ..
2 32 user51
total files: 3

In this case we see that the “/home” directory contains entries for “.”, “..”, and
“user51”.

Using tee

The tee program reads from standard input and writes whatever is read to both
standard output and the named file. You can use this program to create files in our
simulated file system with content created in the operating system environment.

This program is similar to the “tee” command found in many Unix-like operating
systems.

The general format for the tee command is

java tee file-path

where

file-path
is the name of a file to be created in the simulated file system. If the named
file already exists, it will be overwritten.

For example,

echo "howdy, mike" | java tee /home/user51/hello.txt

causes the single line “howdy, mike” to be written to the file “/home/user51/hello.txt”.

The output from the command is

howdy, mike

which you should note was the same as the input sent to the tee program by the
“echo” command.

Note that the “|” (pipe) is almost always used with the tee program. Users of Unix-like
operating systems will find the “echo”, and “cat” commands useful to produce input
for the pipe to tee. Users of MS-DOS-related operating systems will find the “echo”
and “type” commands to be useful in this regard.

If you wish to simply enter text directly to a file, then you may use tee directly (i.e.,
without the pipe). Users of Unix-like operating systems will need to use CTRL-D to
signal the end of input. Users of MS-DOS-related operating systems will need to use
CTRL-Z to signal the end of input.

Using cp

The cp program allows you to copy the contents from one file to another in the
simulated file system. If the destination file already exists, it will be overwritten.

This program is similar to the “cp” command in Unix-like operating systems, and the
“copy” command in MS-DOS-related operating systems.

The general format of the “cp” command is

java cp input-file-name output-file-name

where

input-file-name
is the path-name for the file to be copied (i.e., the source file), and

output-file-name
is the path-name for the file to be created (i.e., the target file).

For example,

java cp /home/user51/hello.txt /home/user51/greeting.txt

creates a new file “/home/user51/greeting.txt” by copying to it the contents of file
“/home/user51/hello.txt”.

Using cat

The cat program reads the contents of a named file and writes it to standard output.
The cat program is generally used to display the contents of a file.

This program is similar to the “cat” command in Unix-like operating systems, or the
“type” command in MS-DOS-related operating systems.

The general format of the cat command line is

java cat file-name

where

file-name
is the name of the file from which data are to be read for writing to standard
output.

For example,

java cat /home/user51/greeting.txt

causes the file “/home/user51/greeting.txt” to be read, the contents of which are
written to standard output.

In this case, the output from the program might look something like this:

howdy, mike

Dumping the File System

While you are working with the file system simulator, you may wish to dump the
contents of the backing file to see if it contains what you think it contains. The dump
program shows the contents of a file in the operating environment, one byte at a
time, in various formats (hexadecimal, decimal, ASCII).

Note that dump dumps the contents of a real file, not a file in our simulated file
system.

The general format of the dump command line is

java dump file-name

where

file-name
is the name of the file to be dumped. This should generally be the name of the
backing file for the file system simulator (e.g., “filesys.dat”).

The general format of the dump output is

addr hex dec asc

where

addr
is the decimal address of the byte,

hex
is the hexadecimal value of the byte,

dec
is the decimal value of the byte, and

asc
is the corresponding ASCII character if the value is between 33 and 127
(decimal).

Each line of dump output corresponds to a single byte in the file. To keep the listing
brief, dump only displays non-zero bytes from the input file.
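The output format described above can be sketched as follows. This is not the dump program from the distribution: the class name, the single-space column separator and the two-digit upper-case hexadecimal are assumptions, and the sketch works on an in-memory byte array rather than a file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class DumpFormatSketch {
    // Produces one "addr hex dec asc" line per non-zero byte.
    static List<String> dump(byte[] data) {
        List<String> lines = new ArrayList<>();
        for (int addr = 0; addr < data.length; addr++) {
            int value = data[addr] & 0xFF; // treat the byte as unsigned
            if (value == 0) {
                continue; // zero bytes are skipped to keep the listing brief
            }
            // The asc column is only filled for values 33..127, as in the text.
            String asc = (value >= 33 && value <= 127) ? String.valueOf((char) value) : "";
            lines.add(String.format(Locale.ROOT, "%d %02X %d %s", addr, value, value, asc).trim());
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] sample = {0, 'A', 0, (byte) 0xFF};
        for (String line : dump(sample)) {
            System.out.println(line);
        }
    }
}
```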

For example

java dump filesys.dat | more

causes the contents of the file “filesys.dat” to be displayed, one line per byte. The
“| more” causes you to be prompted for each page of the output.

The first page of the output should look something like this:

You should notice, for example, that the first block (the super block) contains a few
numeric values corresponding to the block size (the 1 in the 0 byte means 256),
number of blocks, etc. The second block (starting at byte 256) contains a few bits
that are set, indicating that the first few blocks are allocated. The third block (starting
at 512) contains a few index nodes; the FF/255 values indicate that a direct block is
unallocated. A little further down you will see “.”, and “..” for the directory entries for
the root file system, and other data blocks.

Simulator Configuration File

Each file system simulator program must call Kernel.initialize() before calling any of
the other Kernel methods. The initialize() method reads a configuration file
(“filesys.conf” is the default), opens the backing file for the file system (“filesys.dat” is
the default), and performs other initializations. This section of the user guide
describes the various options which may be set in the configuration file.

Configuration File Options

Name                      Description                                       Default Value

filesystem.root.filename  The name of the file containing the root file     filesys.dat
                          system for the simulation.

filesystem.root.mode      The mode to use when opening the root file
                          system backing file. The mode should either
                          be “rw” for reading and writing, or “r” for
                          read-only access.

process.uid               The numeric user id (uid) to use for the default
                          process context. This should be a number
                          between 0 and 32767.

process.gid               The numeric group id (gid) to use for the
                          default process context. This should be a
                          number between 0 and 32767.

process.umask             The umask to use for the default process          022
                          context. This should be an octal number
                          between 000 and 777.

process.dir               The working directory in the simulated file       /root
                          system to be used for the default process
                          context. This should be a string that starts
                          with “/”.

process.max_open_files    The maximum number of files that may be
                          open at a time by a process. When a process
                          context is created, this many slots are created
                          for possible open files.

kernel.max_open_files     The maximum number of files that may be
                          open at one time by all processes in the
                          simulation. When the simulator starts, this
                          many slots are created for possible open files.

A Sample Configuration File

In addition to the standard configuration file, “filesys.conf”, the distribution also
includes a smaller sample configuration file, “sample.conf”. This is shown below to
illustrate a typical configuration file.

!
! personal filesys configuration file
!
filesystem.root.filename = user51.dat
filesystem.root.mode = r
process.uid = 1000
process.gid = 1000
process.umask = 002
process.dir = /home/user51

In this particular example, the file system is contained in the backing file
“user51.dat”, which is here being opened for read-only access. The working directory
for the default process context is “/home/user51”, with the uid, gid, and umask
shown.

Specifying an Alternate Configuration File

The default configuration file is named “filesys.conf” and is included in the application
distribution. You may modify this file directly to set various options, or you may
create your own configuration file and specify the name of this new file when you
launch your simulator programs.

If you choose to create your own configuration file, you will need to define a system
property “filesys.conf” which contains the name of the file. For example, suppose you
wanted to run the “ls” program using “my_filesys.conf” as the configuration file. Your
java command would look something like this:

java -Dfilesys.conf=my_filesys.conf ls /home

If there is no value set for the “filesys.conf” system property, then the name
“filesys.conf” is used as the default configuration filename.
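The lookup just described can be sketched with the standard java.util.Properties class, whose file format happens to match the sample configuration file (lines starting with “!” are comments, and “name = value” pairs are accepted). Whether the simulator itself uses Properties is an assumption; the class and method names below are hypothetical, and the defaults are the three listed in the options table.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ConfigSketch {
    // Name of the configuration file: the "filesys.conf" system property
    // if it is set (e.g. with -Dfilesys.conf=...), else "filesys.conf".
    static String configFileName() {
        return System.getProperty("filesys.conf", "filesys.conf");
    }

    // Parses configuration text, falling back to the documented defaults.
    static Properties parse(Reader source) {
        Properties defaults = new Properties();
        defaults.setProperty("filesystem.root.filename", "filesys.dat");
        defaults.setProperty("process.umask", "022");
        defaults.setProperty("process.dir", "/root");
        Properties config = new Properties(defaults);
        try {
            config.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return config;
    }

    public static void main(String[] args) {
        String sample = "!\n! personal filesys configuration file\n!\n"
                + "filesystem.root.filename = user51.dat\n"
                + "process.umask = 002\n";
        Properties config = parse(new StringReader(sample));
        System.out.println("config file: " + configFileName());
        System.out.println("backing file: " + config.getProperty("filesystem.root.filename"));
        // The umask is written in octal, so parse it with radix 8.
        System.out.println("umask: " + Integer.parseInt(config.getProperty("process.umask"), 8));
        System.out.println("working dir: " + config.getProperty("process.dir"));
    }
}
```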

Writing File System Simulator Programs

Writing programs that use the File System Simulator requires the use of the Kernel
class, and may involve the use of the classes Stat and DirectoryEntry. If you’re
writing ordinary programs that use the standard file system calls, you should not
need to reference any other classes.

These three classes are described briefly here. For more information, follow the link
for the class to the javadoc for that class.

Kernel
sets up the simulator environment and defines all the system calls. This class
defines: the method initialize(), which is used to initialize the file system
simulator; the creat(), open(), read(), write(), close(), and other methods which
simulate the work of a file system; and constants like EBADF, S_IFDIR, and
O_RDONLY which are used to represent parameter or return values for the
system calls. All the methods and fields of Kernel are static; you do not
instantiate a Kernel object. For examples, see any of the sample programs
(i.e., cat.java, cp.java, ls.java, etc.)

Stat
is a data structure that represents information about a file or directory. This
intends to faithfully represent the Unix stat struct. You may reference fields
within a stat object directly (e.g., stat.st_ino), or using JavaBean-style
accessor/mutator methods (e.g., stat.getIno() or stat.setIno()). Stat objects are
updated by the methods Kernel.stat() and Kernel.fstat(). For examples, see
mkdir.java.

DirectoryEntry
is a data structure that represents a single record in a directory file. This
intends to faithfully represent a Unix dirent struct. It contains an index node
number and a file name. You may reference the fields directly (e.g.,
dirent.d_ino), or using JavaBean-style accessor/mutator methods (e.g.,
dirent.getIno() or dirent.setIno()). However, Java programmers may find it more
convenient to use the getName() and setName() methods (which use String) instead of
the field d_name (which is byte[]). DirectoryEntry objects are updated by the
method Kernel.readdir(). For examples, see mkdir.java and ls.java.

For more information about Unix system calls and the stat and dirent structs, refer to
a Unix system manual. Users of Unix-like systems may find the commands “man -S
2 creat”, “man -S 2 open”, etc. to be helpful.

All programs that use the File System Simulator should adhere to the following
guidelines:

• Invoke the method Kernel.initialize() before any other File System Simulator
calls.
• Use Kernel.exit() when you wish to terminate processing in your program.
• Check for errors after each system call (e.g., creat(), open(), read(), write(),
etc.). Nearly all the system calls return -1 if an error occurs.
• Use Kernel.perror() to print the message associated with an error.
• Use Kernel.getErrno() to determine which error occurred, if needed. Note that
in standard Unix programs you would reference the static process variable
“errno”.

For examples, take a look at the following sample programs in the distribution:

• cat.java
• cp.java
• ls.java
• mkdir.java
• tee.java

Collectively, these sample programs invoke all of the core methods (system calls) of
the file system simulator.

Enhancing the File System Simulator

The following are the internal classes for the file system simulator:

BitBlock
is a data structure that views a device block as a sequence of bits. The
methods setBit(), resetBit(), and isBitSet() are used to set, reset, or check a
bit in the block. This structure is used to implement bitmaps, and is used by
the file system simulator to track allocated and free data blocks in the file
system. BitBlock extends Block.

Block
is a data structure that views a device block as a sequence of bytes. The field
bytes is an array of byte, and is directly accessible. Included are methods to
read() and write() the block to a java.io.RandomAccessFile, which simulate
the action of reading or writing a device block.

FileDescriptor
is a structure and collection of methods that represent an open file. It includes
a number of get and set methods for various tidbits of information about the
open file, and provides readBlock() and writeBlock() methods for reading and
writing the blocks of the file.

FileSystem
is a structure and collection of methods that represent an open (mounted) file
system. It includes a few get and set methods for various fields about the file
system, but more importantly, includes methods to open() the file behind the
file system, to read() and write() blocks of the device, to manage blocks
(allocateBlock() and freeBlock()) and to manage inodes
(allocateIndexNode()). In general, Kernel methods should call FileSystem
methods when they want to read or write data in the file system.

IndexNode
is a structure and collection of methods for representing an index node. This
is meant to reflect the exact structure on disk for an index node. It includes get
and set methods for each of the fields in the index node. Also included are
read() and write() methods which are used to copy data to and from byte
arrays (not disk files).

ProcessContext
is a structure and collection of methods to represent a process. This is where
the simulator stores the uid, gid, umask, dir, and other information for the
current process. It includes get and set methods for each of the fields in a
process.

SuperBlock
is a structure and collection of methods for representing the superblock on the
disk. In our implementation, the superblock contains information about the
block size, number of blocks, offsets to the first block of the free list, inode
block, and data block areas of the device. It includes get and set methods for
each of the fields in the superblock. Also included are methods to read() and
write() the superblock.

Suggested Exercises

1. Use mkfs to create a file system with a block size of 64 bytes and having a
total of 8 blocks. How many directory entries will fit in a block? Use dump to
examine the file system backing file. Use mkdir to create a directory (e.g.,
/usr), and then use dump to examine byte 64. Repeat the process of creating
a directory (e.g., /bin, /lib, /var, /etc, /home, /mnt, etc.) and examining with
dump. How many directories can you create before you fill up the file system?


Data structures assignment/week6a

CI583: Data Structures and Operating Systems

Introduction to probabilistic and non-deterministic algorithms

Outline

1 Non-determinism and tossing coins

2 Probabilistic algorithms

Non-determinism

This lecture is about a class of algorithms which are quite unlike
the ones we’ve discussed so far. These are ones that make use of
probability and randomised elements.

Some of these algorithms use probability to compute approximate
results, often to NP-hard problems, and the longer they are allowed
to run the more accurate the results. Others use random choices
(called a coin toss) to solve problems more elegantly than can be
done in a deterministic fashion.

The dining philosophers

The dining philosophers is a standard problem in concurrency and
illustrates the issues of deadlock and starvation.

n = 5 philosophers are sitting at a table, each with a plate of
spaghetti in front of them, and n forks are on the table.

Philosophers do two things: think, and eat. They can only eat when
two forks are available – if one or more of the forks in front of
them are being used, they need to wait. If they wait too long, they
will starve. They never speak to each other.

The dining philosophers

Any solution has to be deadlock free (if one or more philosophers is
hungry, at least one gets to eat) and starvation free (every
philosopher eats eventually). Why can’t we just instruct each
hungry philosopher to wait for the two forks in front of him to
become free?

If the left-hand fork is free but the right-hand one is taken, then the
left-hand fork might have gone by the time the other one is free.
Also, two philosophers might try to pick up the same fork at the
same time.

If each philosopher picks up his left-hand fork, for example, no-one
will ever eat.

The dining philosophers

In this problem, the act of eating can be considered a critical
section – no adjacent philosophers can eat at the same time. Also,
forks are critical resources – philosophers must take it in turns to
use them.

In this way, the problem models many real-world situations in CS
where processes compete for shared resources.

The dining philosophers

The standard solution is to introduce a new participant, the waiter.
The waiter shows philosophers out of the room when they have
finished eating. When they get hungry again, they queue up at the
door.

The waiter keeps track of the philosophers at the table, restricting
their number to n − 1. If this is true, then at least two philosophers
have an empty place next to them and at least one can eat.

The dining philosophers

Using a waiter works, but it requires us to expend extra resources.

Can we solve the problem without doing that?

We can, if we allow the philosophers to toss coins!

9 / 16

The dining philosophers

Let each philosopher do the following:

Repeat:
1 Think, until hungry
2 Toss coin to choose a direction, R or L
3 Wait until fork in chosen direction is free, then pick it up
4 If other fork is not available:
  4.1 Put down fork
  4.2 Go to 2
5 Otherwise, lift second fork
6 Eat
7 Put down both forks and go to 1
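
The steps above can be sketched in Java as follows. This is a minimal illustration under several assumptions: forks are modelled as ReentrantLock objects, the "think" step is omitted, and the class name CoinTossPhilosophers and the method dine() are invented for this example. Each philosopher stops after a fixed number of meals so that the program terminates.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of the coin-tossing philosophers; names are illustrative. */
final class CoinTossPhilosophers {
    /** Runs n philosophers until each has eaten `meals` times; returns the total meals eaten. */
    static int dine(int n, int meals) {
        ReentrantLock[] forks = new ReentrantLock[n];
        for (int i = 0; i < n; i++) {
            forks[i] = new ReentrantLock();
        }
        AtomicInteger eaten = new AtomicInteger();
        Thread[] threads = new Thread[n];
        for (int p = 0; p < n; p++) {
            final int left = p;
            final int right = (p + 1) % n;
            threads[p] = new Thread(() -> {
                Random coin = new Random();
                for (int m = 0; m < meals; ) {
                    boolean leftFirst = coin.nextBoolean();       // step 2: toss a coin
                    ReentrantLock first = forks[leftFirst ? left : right];
                    ReentrantLock second = forks[leftFirst ? right : left];
                    first.lock();                                 // step 3: wait for chosen fork
                    try {
                        if (second.tryLock()) {                   // steps 4-5: try the other fork
                            try {
                                eaten.incrementAndGet();          // step 6: eat
                                m++;
                            } finally {
                                second.unlock();                  // step 7: put down second fork
                            }
                        }
                    } finally {
                        first.unlock();                           // step 4.1 or 7: put down first fork
                    }
                }
            });
            threads[p].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return eaten.get();
    }
}
```

Because a philosopher never blocks while holding a fork (the second fork is only tried, never waited for), no cycle of waiting philosophers can form in this sketch. As in the slide's version, nothing here prevents an unlucky philosopher from being overtaken repeatedly.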

10 / 16

The dining philosophers

By virtue of a coin toss, this solution is deadlock free with
probability 1 for an infinite execution of the solution.

In order to achieve deadlock, all philosophers would have to toss
their coins at the same time and get the same result (all left or all
right), not just once but every time. The likelihood of this is zero.

This solution does allow starvation though – we can only overcome
this by allowing philosophers to communicate with their immediate
neighbours.

Probabilistic primes

The first algorithms to use probability in a systematic way were
several methods to find prime numbers developed in the 1970s.

Generating prime numbers is not just a hobby for mathematicians –
the ability to find large primes is an essential aspect of some
important algorithms, including many that deal with encryption.

However, deterministic methods for testing primality are inefficient
and impractical for very large numbers.

Probabilistic primes

Solutions that use probability turn out to be surprisingly simple
and reliable. The simplest (not the most accurate) is the Fermat
primality test.

The idea is to search at random for numbers which are witnesses to
a potential prime’s compositeness (i.e. being non-prime). If we find
a witness, we can stop straight away.

The problem is to find a definition of a witness that can be tested
efficiently and by which we know that, for each composite number,
n, more than half of the numbers up to n − 1 will be witnesses. If
we don’t find one, this definition of witness will enable us to stop
searching quite early on, with good confidence that n is prime.

Probabilistic primes

Far fewer than half of the numbers between 1 and n will divide it
evenly, so n%a == 0 won’t do.
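
Instead of testing divisibility, the Fermat test checks whether a^(n−1) mod n equals 1 for randomly chosen a; any a for which it does not is a witness that n is composite. The sketch below is one possible rendering using java.math.BigInteger.modPow. The class name FermatSketch, the method isProbablyPrime(), and the choice to draw a from the range 2 to n − 2 are assumptions of this example, not part of the lecture material.

```java
import java.math.BigInteger;
import java.util.Random;

/** Hypothetical sketch of the Fermat primality test; names are illustrative. */
final class FermatSketch {
    private static final BigInteger TWO = BigInteger.valueOf(2);

    /** Returns false if a witness to compositeness is found, true if none is found in `rounds` tries. */
    static boolean isProbablyPrime(BigInteger n, int rounds, Random rnd) {
        if (n.compareTo(BigInteger.valueOf(4)) < 0) {
            return n.compareTo(TWO) >= 0;   // 2 and 3 are prime; anything smaller is not
        }
        BigInteger nMinusOne = n.subtract(BigInteger.ONE);
        for (int i = 0; i < rounds; i++) {
            BigInteger a;
            do {
                // Draw a uniformly random a and retry until 2 <= a <= n - 2.
                a = new BigInteger(n.bitLength(), rnd);
            } while (a.compareTo(TWO) < 0 || a.compareTo(nMinusOne) >= 0);
            if (!a.modPow(nMinusOne, n).equals(BigInteger.ONE)) {
                return false;               // a is a witness: n is definitely composite
            }
        }
        return true;                        // no witness found: n is probably prime
    }
}
```

A prime n always passes. Some composites pass for many values of a: 561 = 3 × 11 × 17, for example, passes for every a that shares no factor with it, which is why stronger tests are used in practice.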

Fermat’s Little Theorem states that, if p is prime and 1 ≤ a < p, then a^(p−1) ≡ 1 (mod p). The reason that this suits our needs for a definition of witness depends on some fairly heavy number theory. However, if we want to test whether p is prime, we can generate random values of a and see if the equality holds. If not, a is a witness that p is definitely composite. Otherwise, p is probably prime.

Probabilistic primes

If we run this algorithm on, say, 200 random numbers between 1 and p then the chances that it will say prime when it should say composite are less than 1 in 2^200. There are a very small number of non-primes that will pass this test (“Fermat liars”) but these are so rare that this algorithm is more than adequate for many applications. A variation on it is used by the Pretty Good Privacy (PGP) encryption tool.

Probabilistic primes

This is a Monte Carlo algorithm. We have a procedure that can say “no” when the answer is negative. Otherwise, it will say “maybe yes”, and when it does so the answer will actually be “yes” more than half of the time. If we run it many times and the answer is always “maybe yes”, probability tells us that “yes” is very (very) likely to be true.

Data structures assignment/week7a

CI583: Data Structures and Operating Systems

There is nothing magical about operating systems! They are software: like the programs you’ve written, but more complex (presumably). As always, we can decompose the complex problem into a series of smaller, simpler ones.

What is an operating system, anyway?

An operating system sits in between hardware and user-level applications, coordinating the execution of processes and access to resources.
[Figure: components of an operating system: process management, interrupts, memory management, file system, device drivers, networking (TCP/IP, UDP), security (process/memory protection), I/O. Image © http://en.wikipedia.org/wiki/Operating_system]

What is an operating system, anyway?

Another way of thinking of the OS is to use the onion model. In the graphic below, each layer can communicate only with the layer below. (Things aren’t always so simple in practice.)

[Figure: onion model with layers labelled API & system calls, file systems, I/O, memory management, kernel, hardware and userland.]

Motivation

Operating systems are one of the oldest problems in software engineering, having been the focus of concentrated effort since the 1950s. Along the way, countless programming techniques, data structures and algorithms have been developed which have carried over into userland programming.

For instance, anything related to concurrency (carrying out multiple tasks “simultaneously”) was first puzzled over in the context of operating systems.

Motivation

Still, the problem is far from being solved. There is no end of open problems in security, for example.

More generally, the context in which computation takes place evolves continuously – e.g. cloud computing via mobile devices – and since operating systems take a lot of effort to produce and a long time to mature, they tend to have been designed to solve old problems. Most smart phones have Linux installed, which is based on UNIX, born in 1972!

History
1950s

The first operating systems were designed at MIT and elsewhere in the early 50s. Their purpose was to set up tapes, card readers etc., to save users a bit of time before they ran their job.

The major bottleneck in the earliest computers was I/O – whilst the computer was reading from a card or writing to a tape, everything else had to wait.
In the mid-50s, the notion of multiprogramming was developed: whilst some I/O was taking place, the computer may as well be doing something else, such as executing another program.

History
1950s

[Figure: timeline of a single interactive process; legend: computation, waiting for I/O device, waiting for user, waiting for CPU.]

In a system that can only run one process at a time, CPU usage is low.

History
1950s

[Figure: timelines of processes P1, P2 and P3; legend: computation, waiting for I/O device, waiting for user, waiting for CPU.]

By loading several processes simultaneously, much better use can be made of the CPU time. We will be discussing how this is achieved using the concepts of scheduling, time slices and interrupts.

History
1960s

Creating ambitious operating systems that extended the usefulness of computers and the ease of programming them became a major concern in the 1960s, and the first systems that we would recognise as a modern OS emerged. Two massive advances:

• Time sharing: enabling multiple users to interact simultaneously with the computer (using techniques and concepts similar to those required by multiprogramming). CTSS (1962) enabled three concurrent users.
• Virtual memory.

History
Virtual memory

In days of yore, when a program was run, the whole thing (code and data) needed to be loaded into memory, which was tiny at the time. To write a program larger than the primary memory available, the programmer needed to move sections of it in and out of memory themselves, which was time-consuming and error-prone.

History
Virtual memory

Researchers at the University of Manchester developed the concept of virtual memory, which allowed the programmer to behave as if an arbitrary amount of memory were available. To achieve this, an OS typically maps virtual addresses to primary or secondary storage. When an address that maps to secondary storage is requested, a page fault occurs, and the relevant code or data is moved into primary storage (meaning something else is moved out, with the map updated accordingly).
Every modern OS uses virtual memory. We will be discussing the details later in the module.

History
1970s

The operating systems discussed so far ran on vast mainframe computers. In the late 60s and early 70s minicomputers were developed, such as the PDP range from DEC. Minicomputers were smaller and cheaper than mainframes – even small laboratories (and universities) could afford to run and maintain one.

[Figure: minicomputers.]

History
UNIX

UNIX was developed at Bell Labs by Ken Thompson and Dennis Ritchie. One of the things that gave UNIX its long-lasting appeal is that it was implemented in an extremely useful new programming language, C (also invented by Ritchie, making him undoubtedly the most influential programmer of all time).

History
Early PC operating systems

From the mid-70s enthusiasts began building computers in their own homes. The operating systems used by these personal computers were primitive, with very few of the features found in the likes of UNIX. Apple DOS 3.1, for the Apple II, supported a single user, had no multiprogramming, no access control, and no way to protect the OS from accidental overwrites by a user’s program.

History
1980s and early PCs

Microsoft acquired QDOS (Quick and Dirty Operating System) in 1980, rebranded it as MS-DOS, and IBM started shipping it with their low cost PCs. MS-DOS became very popular and went through several versions but never supported any form of multiprogramming or virtual memory.

Microsoft Windows began as an application running on top of MS-DOS and continued that way until Windows NT in 1993. Along the way, there were several improvements, such as the addition of preemptive multitasking and something close to virtual memory.

History
1980s

In the minicomputer world, multitasking was developed – similar to multiprogramming, but emphasising the concurrent execution of multiple programs associated with the same user.
In cooperative multitasking individual programs pass control to one another. In preemptive multitasking the OS decides which task has priority. UNIX allows the user to give advice about this:

$ nice -n 12 gzip /proxylogs/*

History
Rise of the GUI

In 1984 Apple released two new operating systems, Lisa OS and Macintosh OS, each featuring a sophisticated graphical user interface to be controlled by a mouse, “influenced” by work done at Xerox Parc. From this point onwards, users think the GUI is the OS 🙁

Lisa was sophisticated for a PC OS, with multitasking, access control and a hierarchical file system.

GNU/Linux

By the late 80s, Richard Stallman had made great progress with the GNU project, which aimed to create a free OS similar to UNIX. Stallman’s ideas were most unusual – the notion of an OS which was both free and powerful was unheard of.

The GNU environment consisted of file systems, various system tools, compiler, debugger, but was missing an essential component – a kernel.

GNU/Linux

A suitable kernel was supplied by Linus Torvalds, a Finnish student, and Linux 1.0 was released in 1994. Linux has of course been a roaring success and is the dominant OS for servers, supercomputers and mobile devices.

Although a modern GNU/Linux OS contains many advanced features, it bears a strong family resemblance to Thompson and Ritchie’s system. Mac OS X is also based on UNIX.

Watch Revolution OS: https://www.youtube.com/watch?v=4vW62KqKJ5A
Linux kernel features

• multitasking,
• multiuser,
• multiplatform: runs on many different CPUs, not just Intel,
• multiprocessor: used in several loosely-coupled MP applications,
• multithreading: multiple independent threads of control within a single process memory space,
• memory protection between processes, so that one program can’t bring the whole system down,
• demand loads executables: only reads from disk those parts of a program that are actually used,
• POSIX job control,
• virtual memory using paging (not swapping whole processes) to disk,
• dynamically linked shared libraries (DLL’s), and static libraries too, of course,
• standards-compliant (POSIX, System V, and BSD),
• shared copy-on-write pages among executables. This means that multiple processes can use the same memory to run in: increases speed and decreases memory use,
• all source code is available, including the whole kernel and all drivers, the development tools and all user programs,
• multiple virtual consoles: several independent login sessions through the console,
• supports several advanced filesystems,
• transparent access to MS-DOS partitions (or OS/2 FAT partitions) via a special filesystem,
• HFS (Macintosh) file system support is available separately as a module,
• CD-ROM filesystem which reads all standard formats of CD-ROMs,
• TCP/IP networking, including ftp, telnet, NFS, etc,
• Netware client and server,
• Lan Manager/Windows Native (SMB) client and server,
• all major networking protocols: TCP, IPv4, IPv6, etc.
Info courtesy http://www.redhat.com

File systems

A (modern) file system should have the following characteristics:

1 device independence: the user should not be concerned with the physical device to which they are reading/writing or how their data is represented on the device,
2 efficiency: the efficiency of the I/O operation may be improved by performing operations not in the sequence in which they arrive,
3 error conditions treated uniformly: errors from different devices should produce uniform error conditions,
4 robustness: it should be possible to recover from error conditions without intervention from a user program (e.g. writing to a disk that is full).

File systems

In addition, some modern file systems provide journalling, allowing recovery of files after system crash, and use clever strategies to avoid fragmentation.

Limitations on the length of filenames and maximum file sizes remain but are generous enough that there is usually no problem. GFS is a specialised file system for working with very large files.

Data structures assignment/week7b

CI583: Data Structures and Operating Systems
Basic OS Concepts

Outline

1 Terminology
2 Context switching

Terminology

We need a few basic terms before we can begin, many of which you may already be familiar with:

CPU: the central processing unit is the piece of hardware that executes instructions one at a time to carry out basic arithmetic, logic and input/output (IO) operations.

Register: a small amount of storage inside the CPU which can hold one word (normally 32 or 64 bits). A modern CPU has access to 32, 64 or more. The fastest type of storage.
The CPU moves data in and out of registers in order to carry out arithmetic operations etc.

Terminology

Assembly code: written in low-level but still human-readable programming languages whose instructions map very closely to actual machine instructions. More or less all you can do in assembly is move values in and out of registers and the stack and call arithmetic operations. Each assembly language is specialised for particular hardware. High level languages are often compiled into this.

Process: an instance of a program that is currently being executed. Each process may contain several threads, each of which may run concurrently.

Terminology

Primary storage: also called main memory or RAM. A form of volatile data storage used to hold the processes currently being executed, parts of the operating system itself, and so on.

Secondary storage: also called external storage. Non-volatile data storage typically held on a hard drive. Far slower than primary storage, since accessing a particular location requires physical moving parts.

Terminology

Stack: or call stack. An area of main memory dedicated to store information about a process, and which makes temporary storage available to that process.

Heap: an area of main memory available for use by executing processes.

Context switching

As the OS transfers control between processes, interrupts processes to deal with an I/O device etc., it needs to keep track of the setting within which execution is taking place: the context.

The context includes code and data – everything required for (say) a process to continue its work.

Context switching

Units of work other than processes need contexts too. For instance, a process, P, may spawn several threads, each of which shares P’s code and some of its data, though each thread maintains local data too.
Context is switched between threads on the basis of time-slicing and the type of task a thread is carrying out – e.g. a thread may be blocked on I/O.

Context switching

Even within a single thread we need to think about switching contexts, though this is handled by the compiler. Consider the following code in a C-like language:

Compute powers

    int main() {
        int i;
        int a;
        //...
        i = sub(a, 1);
        //...
        return (0);
    }

    int sub(int x, int y) {
        int result = 1;
        int i;
        for(i=0; i= K . This is slow,
as we need to check all available free blocks. Leads to the
creation of many small blocks which might not be much use,
or external fragmentation.

First-fit: allocate the first free block >= K .
Counter-intuitively, this generally leads to less fragmentation.
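
The two strategies can be compared with a small sketch. It assumes the free list is simply an array of free block sizes, and the names FreeListFit, firstFit() and smallestFit() are invented for this example; smallestFit() models the exhaustive strategy described just before first-fit (usually called best-fit), which must scan every free block.

```java
/** Hypothetical sketch of two free-list allocation strategies; names are illustrative. */
final class FreeListFit {
    /** First-fit: index of the first free block of size >= k, or -1 if none. */
    static int firstFit(int[] freeSizes, int k) {
        for (int i = 0; i < freeSizes.length; i++) {
            if (freeSizes[i] >= k) {
                return i;           // stop at the first adequate block
            }
        }
        return -1;
    }

    /** Exhaustive strategy: index of the smallest free block of size >= k, or -1 if none. */
    static int smallestFit(int[] freeSizes, int k) {
        int best = -1;
        for (int i = 0; i < freeSizes.length; i++) {   // always scans the whole list
            if (freeSizes[i] >= k && (best == -1 || freeSizes[i] < freeSizes[best])) {
                best = i;
            }
        }
        return best;
    }
}
```

With free blocks of sizes 100, 30, 50 and 200 and a request of K = 40, first-fit takes the 100 block (leaving 60 free), while the exhaustive strategy takes the 50 block (leaving a sliver of 10), which shows how the latter tends to create many small leftover blocks.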

13 / 25

I/O Dynamic Storage

Dynamic storage
The Buddy System

The Buddy System (Knuth, 1968) is a dynamic storage scheme in
which the size of all blocks is some power of two.

Requests for storage are rounded up to the nearest power of two,
p. If a block of size p is free, it’s taken.

Otherwise, we take the smallest block larger than p and split it in
two – the two halves are called buddies.

14 / 25

I/O Dynamic Storage

Dynamic storage
The Buddy System

If the buddies have size p, one of them is taken.

Otherwise, we take one buddy, split it again and continue until a
block of size p is reached.

When freeing space, if the buddy of the block to be freed is free,
join them together.

If the buddy of the new block is free, join them and keep going
until the largest possible block is formed.
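
The splitting and joining rules can be sketched as follows. This is an illustrative model only: it tracks free blocks by offset in one sorted set per order, requires the caller to pass the original request size back to release(), and uses the invented names BuddySketch, allocate(), release() and freeBlocks(). It assumes the total size is the minimum block size times a power of two.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch of a buddy allocator; names and bookkeeping are illustrative. */
final class BuddySketch {
    private final int minBlock;   // size of an order-0 block
    private final int maxOrder;   // order of the single block covering all memory
    private final List<TreeSet<Integer>> free = new ArrayList<>();  // free offsets, per order

    BuddySketch(int totalSize, int minBlock) {
        this.minBlock = minBlock;
        this.maxOrder = Integer.numberOfTrailingZeros(totalSize / minBlock);
        for (int i = 0; i <= maxOrder; i++) {
            free.add(new TreeSet<>());
        }
        free.get(maxOrder).add(0);   // initially one free block spanning everything
    }

    /** Smallest order whose block size (minBlock * 2^order) holds the request. */
    private int orderFor(int request) {
        int order = 0;
        while ((minBlock << order) < request) {
            order++;
        }
        return order;
    }

    /** Returns the offset of the allocated block, or -1 if no block is large enough. */
    int allocate(int request) {
        int want = orderFor(request);
        int order = want;
        while (order <= maxOrder && free.get(order).isEmpty()) {
            order++;                 // look for the smallest larger free block
        }
        if (order > maxOrder) {
            return -1;
        }
        int offset = free.get(order).pollFirst();
        while (order > want) {       // split: keep the lower half, free the upper buddy
            order--;
            free.get(order).add(offset + (minBlock << order));
        }
        return offset;
    }

    /** Frees a block and joins it with its buddy for as long as the buddy is also free. */
    void release(int offset, int request) {
        int order = orderFor(request);
        while (order < maxOrder) {
            int buddy = offset ^ (minBlock << order);   // buddy differs in exactly one bit
            if (!free.get(order).remove(buddy)) {
                break;               // buddy is in use: stop joining
            }
            offset = Math.min(offset, buddy);
            order++;
        }
        free.get(order).add(offset);
    }

    int freeBlocks(int order) {
        return free.get(order).size();
    }
}
```

With 512K of memory and 64K as the smallest block, requests of 34K, 66K, 35K and 90K land at offsets 0, 128, 64 and 256 in this model, and releasing all four joins the pieces back into a single 512K block.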

15 / 25

The Buddy System

[Figure sequence: a memory region with markers at 64K and 512K, showing the blocks held by programs A, B, C and D after each of the following steps.]

1. Program A requests 34K (Order 0)
2. Program B requests 66K (Order 1)
3. Program C requests 35K (Order 0)
4. Program D requests 90K (Order 1)
5. Program B releases its memory
6. Program D releases its memory
7. Program A releases its memory
8. Program C releases its memory

I/O Dynamic Storage

The Buddy System

Allocating and deallocating memory using the Buddy System is
fast – the system can be represented as a binary tree.

However, internal fragmentation can be high if the amount of
storage requested is slightly larger than a small block but much
smaller than the next size up (e.g. 65K, in our example).

The Linux kernel uses the Buddy System with modifications to
reduce internal fragmentation.

24 / 25

I/O Dynamic Storage

Next week

Next week we’ll be looking at principles of OS design – monolithic
kernels versus micro-kernels, and issues around virtualisation.

25 / 25


Data structures assignment/week4b

CI583: Data Structures and Operating Systems

Algorithmic strategies to solve NP-hard problems

1 / 15

Depth-first and breadth-first

Many NP problems can be reduced to a problem in which the
nodes in a graph must be visited exactly once – this is a more
general idea of traversal, which we used with trees.

For instance, we may need to transmit a network packet to every
computer on a network, making sure that no computer receives it
twice.

There are two ways we might do this: depth-first and breadth-first.

A depth-first traversal will go as far as possible down a given path
before it considers any other. A breadth-first traversal goes evenly
in many directions.
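
Both traversals can be sketched over an adjacency-list graph. The class name Traversals, the method names, and the representation of the graph as a Map from node to neighbour list are assumptions of this example; the small graph in the usage note is made up and is not the one in the slides.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of depth-first and breadth-first traversal; names are illustrative. */
final class Traversals {
    /** Visits each reachable node once, following one path as far as possible first. */
    static List<Integer> depthFirst(Map<Integer, List<Integer>> graph, int start) {
        List<Integer> order = new ArrayList<>();
        visit(graph, start, new HashSet<>(), order);
        return order;
    }

    private static void visit(Map<Integer, List<Integer>> graph, int node,
                              Set<Integer> seen, List<Integer> order) {
        if (!seen.add(node)) {
            return;                          // already visited
        }
        order.add(node);
        for (int next : graph.getOrDefault(node, List.of())) {
            visit(graph, next, seen, order); // returning from here is "going back up the path"
        }
    }

    /** Visits each reachable node once, nearest nodes first, using a queue. */
    static List<Integer> breadthFirst(Map<Integer, List<Integer>> graph, int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> seen = new HashSet<>(List.of(start));
        Deque<Integer> queue = new ArrayDeque<>(List.of(start));
        while (!queue.isEmpty()) {
            int node = queue.poll();
            order.add(node);
            for (int next : graph.getOrDefault(node, List.of())) {
                if (seen.add(next)) {        // enqueue each node only once
                    queue.add(next);
                }
            }
        }
        return order;
    }
}
```

For a graph with edges 1→2, 1→3, 2→4 and 3→4, the depth-first order from node 1 is [1, 2, 4, 3] and the breadth-first order is [1, 2, 3, 4].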

[Figure: a graph with nodes numbered 1 to 9, used in the traversal examples that follow.]

Depth-first traversal

When we reach a dead end we go back up the path until we find a
node with an unvisited adjacent node. One traversal of this graph
starting at 1 reaches a dead end at 6: [1, 2, 3, 4, 7, 5, 6].

We then go back to node 7 and find that 8 is unvisited. We visit 8
and reach a dead end.

We then go back to 4 and find that 9 is unvisited. The next time
we go back up the path we end up at the starting node and we are
done.

Breadth-first traversal

[Figure sequence: breadth-first traversal of the same nine-node graph, shown step by step.]

Traversal

[Figure: graph; surviving labels 5, 4 and 3.]

Weighted graphs

Consider the problem of finding “cheapest” paths in a directed and
weighted graph:

[Figure: directed, weighted graph; surviving labels 5, 4, 3, 2 and 5.]

Minimum spanning trees

The MST for a graph, G, is a spanning tree in which the total of
the edge weights is minimal.

We can calculate the MST by brute force – for n edges this takes
2^n comparisons between potential MSTs – this is an NP problem.

15 / 15


Data structures assignment/week8a

Processes and threads

CI583: Data Structures and Operating Systems
OS Design Principles

1 / 17

Processes and threads

Outline

1 Processes and threads

2 / 17

Processes and threads

Conceptually, the thread is a unit of computation, to be stopped
and started by the OS.

[Figure: a process containing Thread #1 and Thread #2, drawn against a time axis.]

3 / 17

Processes and threads

Each thread belongs to an enclosing process, which can also be
stopped and started by the OS and which is used to store context
which includes:

an address space (the set of memory locations that contains
the code and data for the program),

a list of references to open files, and

other information that might be common to several threads.

4 / 17

Processes and threads

To represent a process we use a data structure called a process
control block (PCB), containing references to all the info given
above and to a list of threads.

Each thread is represented by a thread control block (TCB). The
TCB contains references to thread-specific context, including the
thread’s stack and contents of registers.
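
As a data-structure sketch, the two control blocks might be modelled like this. The field names follow the description above, but the types, the register count, the ThreadState enum and the helper methods are assumptions made for illustration; a real kernel would hold far more.

```java
import java.util.ArrayList;
import java.util.List;

/** Thread states used in this sketch (assumed names). */
enum ThreadState { RUNNABLE, RUNNING, WAITING, TERMINATED }

/** Hypothetical thread control block: thread-specific context only. */
final class ThreadControlBlock {
    long stackPointer;                           // where this thread's stack currently ends
    final long[] otherRegisters = new long[16];  // register count is illustrative
    ThreadState state = ThreadState.RUNNABLE;
}

/** Hypothetical process control block: context shared by all of the process's threads. */
final class ProcessControlBlock {
    final String addressSpaceDescription;
    final List<Integer> openFileDescriptors = new ArrayList<>();
    final List<ThreadControlBlock> threads = new ArrayList<>();

    ProcessControlBlock(String addressSpaceDescription) {
        this.addressSpaceDescription = addressSpaceDescription;
    }

    /** Creates a new thread belonging to this process and records it in the thread list. */
    ThreadControlBlock createThread() {
        ThreadControlBlock tcb = new ThreadControlBlock();
        threads.add(tcb);
        return tcb;
    }

    /** True once every thread of this process has terminated. */
    boolean allThreadsTerminated() {
        for (ThreadControlBlock tcb : threads) {
            if (tcb.state != ThreadState.TERMINATED) {
                return false;
            }
        }
        return true;
    }
}
```

The point of the split is that switching between two threads of the same process only needs to swap TCB contents, whereas the address space and open files in the PCB stay in place.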

[Figure: a PCB (address space description, open file descriptors, list of threads, current state) linked to several TCBs (stack pointer, other registers, current state), each with its own stack.]

5 / 17

Processes and threads

The OS needs to be able to create, delete and synchronise
processes.

In Windows and *NIX a process is either active or terminated.

A terminated process is deleted (by a process dedicated to this
purpose) when all of its threads are terminated and there are no
more references to the process from elsewhere.

6 / 17

Processes and threads

When a process is created, the address space is loaded into
memory. On Linux, this is accomplished by the exec family of system calls.

How does the OS initialise the address space?

One approach would be to copy the code and data of the program
into the address space, but this would be inefficient.

7 / 17

Processes and threads

The code section of the program is read-only and so can be shared
by any processes executing the same program.

The parts of the data section that are never modified can also be
shared.

8 / 17

Processes and threads

A better approach is to map the executable file into the address
space.

Both the address space and the executable file are divided into
blocks of equal size called pages.

9 / 17

Processes and threads

The text regions of all processes running this executable are set up
using hardware-address translation facilities, with each process
mapping to the same location.

The data regions of each process initially refer to a single copy of
the data portion of the executable, which has been copied into
memory.

10 / 17

Processes and threads

[Figure: two processes, A and B, running the commands "$ find / -amin +1" and "$ find ./ -name Main.java". Each process has TEXT and DATA regions; the memory map shows entries A: TEXT, A: DATA, B: TEXT and B: DATA.]

11 / 17

Processes and threads

When a process modifies data for the first time, it is given a new,
private page containing a copy of the pristine page.

This is called a private mapping.

Modern systems also use the notion of a shared mapping: when
data is modified the original page is altered and all processes see
the change.
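
The difference between the two kinds of mapping can be sketched with a single page held as a byte array. The class name PageMapping and its methods are invented for this example, and copying on the first write is modelled with a plain array clone rather than any real memory-management facility.

```java
/** Hypothetical sketch of private (copy-on-write) versus shared page mappings. */
final class PageMapping {
    private byte[] page;              // the page this mapping currently refers to
    private final boolean isPrivate;  // private mappings copy on first write
    private boolean copied;

    PageMapping(byte[] pristinePage, boolean isPrivate) {
        this.page = pristinePage;     // initially every mapping refers to the same page
        this.isPrivate = isPrivate;
    }

    byte read(int offset) {
        return page[offset];
    }

    void write(int offset, byte value) {
        if (isPrivate && !copied) {
            page = page.clone();      // first modification: switch to a private copy
            copied = true;
        }
        page[offset] = value;         // shared mappings alter the original page
    }
}
```

Two private mappings of the same page stop seeing each other's data after the first write, whereas two shared mappings always read the same bytes.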

12 / 17

Processes and threads

[Figure: the same two processes, A ("$ find / -amin +1") and B ("$ find ./ -name Main.java"), with their TEXT and DATA regions and the memory map entries A: TEXT, A: DATA, B: TEXT and B: DATA.]

13 / 17

Processes and threads

When a thread is created it is in the runnable state.

At some point it is switched into the running state by the
scheduler (which we will come back to in more detail).

A thread that is running may then be put into the waiting state,
for instance because it has initiated an I/O action and needs to
wait for the result before doing anything else.

14 / 17

Processes and threads

The thread can put itself into the waiting state, or can be moved
into the state by the OS.

If the OS uses time-slicing, whereby threads are run for a
maximum period of time before relinquishing the processor, the
scheduler can move the thread into the waiting state.

15 / 17

Processes and threads

A thread can move itself into the terminated state, or it can be
moved there by the OS.

Note that a thread must be in the running state at least once
before terminating.

Terminated threads are removed in various ways, such as by a
dedicated “reaper” process.

16 / 17

Processes and threads

[Figure: thread state diagram with the states Runnable, Running, Waiting and Terminated.]

17 / 17


Data structures assignment/week9e

Journalled file systems

CI583: Data Structures and Operating Systems
Journalled File systems


In journalling, the steps of a transaction are written to a journal
before being committed, or written to disk.

If anything goes wrong whilst the changes are being written to
disk, a recovery procedure repeats the steps from the journal when
the system restarts.

If any change is made twice, that’s not a problem because changes
are idempotent – carrying them out n times is the same as carrying
them out once.
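A minimal sketch of why replaying a redo journal is safe, assuming a hypothetical in-memory model in which the disk is a map from block number to contents. The class and method names (RedoJournal, log, replay) are illustrative and not taken from any real file system. Each journalled step says "set block N to this value", which is idempotent, so replaying the journal twice leaves the disk in the same state as replaying it once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model: the "disk" is a map from block number to contents.
class RedoJournal {
    // One journalled step: "set this block to these contents".
    // Overwriting with a fixed value is idempotent; "append" would not be.
    static final class Step {
        final int block;
        final String contents;
        Step(int block, String contents) {
            this.block = block;
            this.contents = contents;
        }
    }

    private final List<Step> journal = new ArrayList<>();

    // Record a step in the journal before it is applied to the disk.
    void log(int block, String contents) {
        journal.add(new Step(block, contents));
    }

    // Recovery: repeat every journalled step, whether or not it
    // had already reached the disk before the crash.
    void replay(Map<Integer, String> disk) {
        for (Step s : journal) {
            disk.put(s.block, s.contents);
        }
    }
}
```

If a crash happens after only some steps reached the disk, calling replay brings the remaining blocks up to date and simply rewrites the ones that were already correct.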


There are two ways to do this:

redo journalling (as described above), and

undo journalling, in which the old contents of data blocks are
stored in the journal, allowing the user to get back to the
previous consistent state after a crash.


Making every change twice (once in a journal, once for real) risks
doubling the amount of work, and journalling would not have been
practical on the systems of the 70s.


How big, then, should a transaction be?

Deleting a large file might require millions of operations, so that a
single task is too big for the journal.

Carrying out each small change in its own transaction would be
inefficient.


In practice, small changes are batched together into one
transaction and big changes are broken up into several transactions.

Some systems (e.g. ext3) use time-based transactions.

The important thing is that we respect the ACID rules and take
the system from one consistent state to another.


Shadow-paged file systems

Journalled file systems ensure consistency but involve rather a lot
of work. There is a simpler solution: shadow-paged file systems.

Like journalling, shadow-paged systems are only viable in a context
where memory is plentiful and processors are fast.

Examples include ZFS from Sun.


The idea is simple – the whole file system is represented in memory
as a data structure called the shadow-page tree, the root of which
is called the überblock.

The tree contains pointers to all metadata and data blocks.


Whenever a disk block is about to be changed a copy is made and
it is that which is modified – the original is kept unchanged.

To link the modified copy into the tree, the parent node must be
altered, but instead of changing it directly, that node is also
copied, and so on, up to the root node.

This procedure is called copy-on-write.
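The copying of a node and all of its ancestors can be sketched with a small immutable tree. This is only an illustration: CowNode and the 'L'/'R' path strings are hypothetical, and a binary tree stands in for the real shadow-page tree. Only the nodes on the path from the root to the changed node are copied; every other subtree is shared between the old and the new tree.

```java
// Hypothetical binary tree standing in for the shadow-page tree.
final class CowNode {
    final String data;
    final CowNode left;
    final CowNode right;

    CowNode(String data, CowNode left, CowNode right) {
        this.data = data;
        this.left = left;
        this.right = right;
    }

    // Returns a new root in which the node reached by following `path`
    // (a string of 'L' and 'R' steps) holds newData. Nodes on the path
    // are copied; all other nodes are shared with the original tree.
    static CowNode update(CowNode node, String path, String newData) {
        if (path.isEmpty()) {
            return new CowNode(newData, node.left, node.right);
        }
        String rest = path.substring(1);
        if (path.charAt(0) == 'L') {
            return new CowNode(node.data, update(node.left, rest, newData), node.right);
        }
        return new CowNode(node.data, node.left, update(node.right, rest, newData));
    }
}
```

Because the original root is never altered by update, it still describes the tree exactly as it was before the change.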


The root node is modified directly in a single disk write.

If the system crashes while an update is in progress but before the
root node has been altered, the system comes back up with the old
copy of the tree.

By keeping the original root node, we have a snapshot of the
system as a whole.

[Figures: the shadow-page tree, with levels root, indirect inode blocks, direct inode blocks, indirect data blocks and direct data blocks, shown over six slides.]

Data structures assignment/week9d

Resiliency

CI583: Data Structures and Operating Systems
File systems: Resiliency


Early file systems such as S5FS performed badly in the face of
crashes and other unexpected shutdowns.

Not only could the files which were open at the time be corrupted,
but other completely unrelated files could also be damaged.

It is easy to understand how unwritten modifications to a file could
be lost, but how were other files being affected?


This happened because data structures which describe the whole
file system, such as the superblock and the I-list, could easily
become corrupted.

This is even more serious than losing the latest version of a user’s
file.

In the worst case, two separate inodes could point to the same
data, or system files could be marked as free space, etc!


The problem stems from the fact that many operations that we
think of as a “semantic unit”, such as the system call write,
actually comprise many operations.

Consider the task of adding an element to the back of a queue.

In combination with caching, an interruption here can lead to the
“last” element in the queue being some random memory location.

[Figures: Corrupting a data structure, shown step by step over four slides.]

If this is the case, then the data is consistent.


The measures taken by file systems to ensure consistency have
normally focused on metadata consistency, rather than the
contents of user files, although this is obviously important to the
user.

One approach is to ensure that every change to the system
(write, creat, rename, unlink) takes the system from one
consistent state to another.


A second approach is called the transactional approach.

Using this approach, we collect updates into groups called
transactions, inspired by the database world.

This group of updates can be treated as a single action with the
ACID properties.


Atomic: each transaction is all-or-nothing – either all of it
takes place or none of it does.

Consistent: the system is always consistent. None of the
inconsistent states that might exist while a transaction is
taking place are visible outside of the transaction.

Isolated: a transaction has no effect until it is committed to
disk, and is not affected by any other ongoing (uncommitted)
transactions.

Durable: once a transaction is committed, its effects persist.

The two main approaches to providing these properties are
journalling and shadow-paging, which we will look at in turn.


Data structures assignment/week4c

CI583: Data Structures and Operating Systems

Algorithmic strategies to solve NP-hard problems


Greedy algorithms

Greedy algorithms work by looking at a subset of a larger problem and finding the best solution for that subset.

Dijkstra’s Shortest Path algorithm is a greedy algorithm developed in the late 50s. The variant described here builds a minimum spanning tree (MST) and is more widely known as Prim’s algorithm; Dijkstra published it in 1959 together with his shortest-path algorithm. At each step it looks at adjacent edges and decides which one to add to the spanning tree. By doing this repeatedly, we end up with a spanning tree that has the minimum overall total, the MST.

Dijkstra’s shortest path

The algorithm works by putting all nodes into one of three categories:

1 in the tree: nodes already added to the spanning tree,
2 on the fringe of the tree: those nodes adjacent to the current node, and
3 not yet considered.

The algorithm, informally:

1 Select a starting node.
2 Build the initial fringe from nodes adjacent to the starting node.
3 While there are nodes not yet considered, do
  1 Choose the edge in the fringe with the smallest weight.
  2 Add the associated node to the tree.
  3 Continue with the new selected node and updated fringe.
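The informal steps above can be sketched in Java as follows. This is an illustration under stated assumptions rather than the lecture's own code: the graph is given as an adjacency matrix in which 0 means "no edge", it is assumed to be connected, and the names FringeMst and totalWeight are made up for the example. The array best records, for each node on the fringe, the cheapest edge linking it to the tree.

```java
import java.util.Arrays;

class FringeMst {
    // weights[i][j] > 0 is the weight of the edge between i and j; 0 means no edge.
    // Returns the total weight of the spanning tree, assuming the graph is connected.
    static int totalWeight(int[][] weights, int start) {
        int n = weights.length;
        boolean[] inTree = new boolean[n];
        int[] best = new int[n];   // cheapest known edge from the tree to each node
        Arrays.fill(best, Integer.MAX_VALUE);
        best[start] = 0;
        int total = 0;
        for (int added = 0; added < n; added++) {
            // Choose the node outside the tree with the smallest connecting edge.
            int next = -1;
            for (int v = 0; v < n; v++) {
                if (!inTree[v] && (next == -1 || best[v] < best[next])) next = v;
            }
            inTree[next] = true;
            total += best[next];
            // Update the fringe with the edges of the newly added node.
            for (int v = 0; v < n; v++) {
                if (!inTree[v] && weights[next][v] > 0 && weights[next][v] < best[v]) {
                    best[v] = weights[next][v];
                }
            }
        }
        return total;
    }
}
```

Scanning the whole array for the smallest fringe edge keeps the sketch short; a priority queue would be the usual choice for large sparse graphs.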


Backtracking

The class of problems related to finding a path through a maze or avoiding a series of obstacles can be solved by backtracking algorithms. This type of algorithm makes choices at each step and, when it reaches a dead end, retraces its steps to a point at which an alternative choice can be made.

Image © http://www.liyenchong.com

Backtracking techniques save the state of the problem each time a choice is made. These techniques are not only useful for path-finding.

Consider the N-queens problem. This problem asks how to position N queens on an N × N chess board in such a way that they don’t threaten each other.

4-queens

We can develop a state space tree to describe the 4-queens problem. Note that no row, column or diagonal can contain more than one queen. Each level of the state space tree represents the possible places where each queen can be placed for one of the rows of the board.

The state space tree will have 256 leaves, with each path from the root to a leaf representing a possible solution. Most of these are not solutions of course.

[Figure: the top of the state space tree: the root has children 1,1 1,2 1,3 1,4, and each of those has children 2,1 2,2 2,3 2,4.]

Given the state space tree, we can carry out a depth-first traversal to place 4 queens on the board, then check whether they can attack each other:

root → 1,2 → 2,4 → 3,1 → 4,3

What is wrong with this approach?

Using this approach, much more work is done than necessary. As soon as we place a queen so that it threatens another, a path that passes through that node will not lead to a solution and we call that a nonpromising node. So there is no need to populate or search subtrees with a nonpromising root.

A backtracking recursive solution to n-queens would stop recursing whenever it reaches a nonpromising node. Note that, in this case, the algorithm does not literally need to retrace its steps.


The general strategy:

procedure SearchSpace(i)
    if there is a solution then
        output the solution
    else
        for every possible next step do
            if the step is promising then
                SearchSpace(i+1)
            end if
        end for
    end if
end procedure
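As one concrete instance of the general strategy, here is a sketch of a backtracking solver for N-queens that counts solutions. The names Queens, place and promising are chosen for the example and are not from the lecture code. The array cols records the column chosen for each row, and recursion continues only from promising nodes.

```java
class Queens {
    // Counts the ways of placing n queens on an n x n board.
    static int countSolutions(int n) {
        return place(new int[n], 0);
    }

    // cols[r] is the column of the queen already placed in row r.
    // Tries every column for `row`, recursing only from promising nodes.
    private static int place(int[] cols, int row) {
        if (row == cols.length) return 1;   // every row has a queen: a solution
        int found = 0;
        for (int c = 0; c < cols.length; c++) {
            if (promising(cols, row, c)) {
                cols[row] = c;
                found += place(cols, row + 1);
            }
        }
        return found;
    }

    // A square is promising if no earlier queen shares its column or a diagonal.
    private static boolean promising(int[] cols, int row, int c) {
        for (int r = 0; r < row; r++) {
            if (cols[r] == c || Math.abs(cols[r] - c) == row - r) return false;
        }
        return true;
    }
}
```

For the 4-queens board, Queens.countSolutions(4) returns 2: the path 1,2 → 2,4 → 3,1 → 4,3 and its mirror image.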


Dynamic programming

Dynamic programming techniques include algorithms in which the most efficient solutions depend on choices that might change with time.

The key feature of dynamic programming is memoisation: this is the technique of storing the result of expensive computations in order to reuse them.
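A small sketch of memoisation, using the Fibonacci numbers as the expensive computation. The class name MemoFib and the choice of a HashMap as the cache are assumptions made for the example, and the numbering assumes fib(0) = 0 and fib(1) = 1. Each value is computed once and then looked up, so the number of additions grows linearly with n rather than exponentially.

```java
import java.util.HashMap;
import java.util.Map;

class MemoFib {
    // Results already computed, keyed by n.
    private final Map<Integer, Long> cache = new HashMap<>();

    // Returns the nth Fibonacci number, counting from fib(0) = 0 and fib(1) = 1.
    long fib(int n) {
        if (n < 2) return n;
        Long known = cache.get(n);
        if (known != null) return known;   // reuse the stored result
        long result = fib(n - 1) + fib(n - 2);
        cache.put(n, result);
        return result;
    }
}
```

The result is a long, which overflows beyond fib(92); the sketch does not guard against that.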


This Java method calculates the nth element in the Fibonacci
sequence (0, 1, 1, 2, 3, 5, 8 . . .):

1 int fibonacci(int n) {

Demo


Efficient directories

Next time

Memory management: virtual memory, page tables, etc.


Data structures assignment/week8b

Managing hardware Shared libraries

CI583: Data Structures and Operating Systems
OS Design Principles


Outline

1 Managing hardware

2 Shared libraries


Managing hardware

In an earlier lecture we discussed device drivers, which contain the
code that knows how to interact with a given device.


In early UNIX systems the kernel had hard-coded support for the
relevant devices. In order to add a new device, the user needed to
recompile the kernel image.


Special files were created on the file system in the /dev directory
to refer to devices.

These “special” files refer to the device using its device number.
Thus, when an application opens the “file” /dev/sda1 the kernel
uses the right device and driver.


This approach was laborious and inadequate for the number of
devices that can be used with today’s systems.

The modern approach uses what we might call meta-drivers, each
responsible for the class of devices that can use a particular bus.

So, the USB driver knows about USB devices and probes the
system at start-up time to see which are attached.


The correct drivers and kernel modules are then loaded to initialise
these devices.

Whilst a modern Linux system is running, the udev “daemon”
listens for the connection and removal of devices and updates the
contents of /dev accordingly.

Most of what udev does is in userland (doesn’t require special
privileges).


Interacting with a device

We begin by considering how the OS might interact with a very
simple device – a terminal, which takes user input at the command
line and displays the results in a simple text-only UI.

Although this might seem like
an obsolete example, the way
that the OS reads from and
writes to this device is
something that can be adapted
for many contexts.


We can see straight away that it’s not enough to read one
character at a time from the device, or to write data straight to
the device. Why not?

1 Data may be sent to the device faster than it can be
processed, or generated by the user faster than the application
can accept it.

2 Chars may arrive from the keyboard when there is no waiting
read request.

3 Input may need to be processed in some way before it reaches
the application.

For instance, chars need to be echoed to the screen so that
the user can see what they are typing.

Chars may be grouped into lines which can be edited before
submitting by pressing enter.

The terminal may support tab-completion.


Items 1 and 2 are examples of the producer-consumer problem, in
which two processes need to synchronise their access to a finite
queue or buffer.

The producer’s job is to generate a piece of data, put it into the
buffer and start again.


At the same time, the consumer is consuming the data (i.e.,
removing it from the buffer) one piece at a time.

The problem is to make sure that the producer won’t try to add
data into the buffer if it’s full and that the consumer won’t try to
remove data from an empty buffer.
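A minimal sketch of the producer-consumer problem in Java, using the built-in monitor methods wait and notifyAll. The class BoundedBuffer is written for this example; in practice java.util.concurrent.ArrayBlockingQueue offers the same behaviour ready-made. The producer blocks while the buffer is full and the consumer blocks while it is empty.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class BoundedBuffer<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        this.capacity = capacity;
    }

    // Called by the producer: blocks while the buffer is full.
    synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) wait();
        items.add(item);
        notifyAll();   // wake a consumer that may be waiting for data
    }

    // Called by the consumer: blocks while the buffer is empty.
    synchronized T take() throws InterruptedException {
        while (items.isEmpty()) wait();
        T item = items.remove();
        notifyAll();   // wake a producer that may be waiting for space
        return item;
    }
}
```

The conditions are checked in while loops rather than if statements because a thread that wakes up must re-check the buffer before proceeding.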


We separate this common functionality into something called a line
discipline module.


Shared libraries

We have already mentioned the fact that each OS has an API and
that high-level languages provide convenient calls to the API in
standard libraries.

As well as wrappers for the API, standard libraries contain large
numbers of standard functions for convenience.


These shared libraries are called dynamic-link libraries (DLLs) on
Windows and shared objects on Linux.

One advantage of using them is that they need not be loaded until
needed, improving the start-up time of a program.


The biggest advantage is code reuse – few programmers would
attempt GUI programming without a library.


When a program is converted into machine code that can be
executed directly by the OS, it needs to contain everything it needs
to run.

When we include a call to a shared function in our program, such
as a call to the C function printf (or, more indirectly,
System.out.println in Java), a compiler could take one of
several choices:


Approaches to shared libraries:

1 Include a copy of printf with our program. This would make
our programs much larger than they need to be and be a
waste of storage.

2 Load a copy of printf at runtime, along with our program.
There would only be one copy on disk but each program would
load it into memory: again, this would be very inefficient.


After the break

Storage, virtual memory, virtualisation, scheduling.


Data structures assignment/week8c

Scheduling

CI583: Data Structures and Operating Systems
Scheduling


We have said that the main purpose of an OS is to manage
resources, and one of the ways of doing this is to ensure that
processes and threads have access to a “fair” amount of processor
time.


giving priority to interactive threads,

giving priority to system-level threads,

maximising the number of threads processed per unit of time,

a combination of the above.


Scheduling strategies

The strategy we use will depend on the type of system we are
designing:

1 Simple batch systems (SBS): Each process runs to completion
and the job of the scheduler is to pick the next task to be run.

The two main considerations are system throughput and
average wait time.


2 Multiprogrammed batch systems: Same as SBS but several
jobs are run concurrently.

The considerations from the SBS are supplemented with the
questions of how many jobs should be running at any one
time and how the processor time is apportioned amongst
running jobs.


3 Time-sharing systems: here the main question becomes
apportioning time to the jobs that are ready to execute – the
runnable threads.

The main concern is response time – the time from when a
command is given to when it is completed. Short requests
should be completed quickly.


4 Shared servers: A single computer being used by many clients.

The question arises here of giving each client a reasonable
apportionment of time, instead of focusing only on runnable
threads.


5 Real-time systems: Can be soft or hard.

An example of soft real-time would be video processing – data
must be processed in a strictly synchronised fashion and as
efficiently as possible, but some lag is not a disaster.

An example of hard real-time would be autonomous vehicle
software that alters the direction of a car – certain commands
must be handled in a timely way.


Simple Batch Systems

On an SBS, jobs are run one at a time and, when a job ends, we
can choose between several queued jobs to decide which to run
next.

There are two approaches we could take:

1 First-in-first-out (FIFO): jobs are executed in the order they
were submitted to the system.

2 Shortest-job-first (SJF): some (relatively crude) measure is
used to decide how long a job is going to take, and the
shortest will be executed first.


FIFO might seem fairest, but if we take average throughput as the
measure of effectiveness for our scheduler, then SJF will perform
best.

However, without further measures, SJF could mean that a very
long job is queued indefinitely, so we need to make sure that
doesn’t happen.
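To make the comparison concrete, the sketch below computes the average wait time, one of the two considerations mentioned earlier for simple batch systems, for FIFO and for SJF. It assumes that all jobs are submitted at time 0 and that their run times are known exactly, which a real scheduler can only estimate. The names BatchWait, averageWait and averageWaitSjf are made up for the example.

```java
import java.util.Arrays;

class BatchWait {
    // Average time jobs spend queued when run one at a time in the given order.
    static double averageWait(int[] runTimes) {
        long waited = 0;
        long clock = 0;
        for (int t : runTimes) {
            waited += clock;   // this job waited for every job before it
            clock += t;
        }
        return (double) waited / runTimes.length;
    }

    // Shortest-job-first is the same calculation on the sorted run times.
    static double averageWaitSjf(int[] runTimes) {
        int[] sorted = runTimes.clone();
        Arrays.sort(sorted);
        return averageWait(sorted);
    }
}
```

For jobs with run times 10, 1 and 2 submitted in that order, FIFO gives waits of 0, 10 and 11 (an average of 7), while SJF gives waits of 0, 1 and 3 (an average of about 1.33). The long job is the one that pays for the improvement.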


Multiprogrammed Batch Systems

An MBS is essentially the same as SBS except that several jobs are
held in memory at one time and time-slicing is used to switch
focus between the currently active jobs.

A solution which gets round some of the problems with SJF is to
hold multiple queues of relatively short and relatively long jobs, so
a short and a long job will be held in memory at any one time.


Time-Sharing Systems

A TSS is one which has several users logged on at any one time.

The main concern of the scheduler is that the system should feel
responsive. A job which ought to be short should be short.


If a job that normally takes 3 or 4 minutes, such as compiling some
code, takes 5 minutes, users probably won’t mind.

But if a job that should be very quick, such as opening a file in a
word processor, takes 1 minute then the users will think the system
is slow.


A scheduling strategy for a TSS should favour short and interactive
operations at the possible expense of longer ones.

As a first attempt, consider a time-sliced scheduler that uses a
round robin strategy. When a thread uses up its time quantum, it
is moved to the back of the queue and the next thread gets to run.

[Figure: jobs labelled SHORT, SHORT/INTERACTIVE and LONG.]

Determining the level of interactivity of a process will necessarily
involve some guesswork.

A more effective algorithm is to reduce the priority of a thread
each time it uses a time quantum.


The first time a thread runs, it does so at the highest priority level
and, presuming it isn’t completed, its priority is reduced.

The next time it runs, its priority is reduced again, and so on.
This is called a multilevel feedback queue.

In this system, threads in a low-priority queue can only run if there
are no threads in a higher queue.
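A much simplified sketch of a multilevel feedback queue. Threads are represented by their names only, and the caller tells the scheduler whether the thread finished within its quantum; both are assumptions made to keep the example short, and the names FeedbackQueues, admit and runOneQuantum are hypothetical. Level 0 is the highest priority.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class FeedbackQueues {
    private final List<Deque<String>> levels = new ArrayList<>();

    FeedbackQueues(int levelCount) {
        for (int i = 0; i < levelCount; i++) levels.add(new ArrayDeque<>());
    }

    // New threads start at the highest priority (level 0).
    void admit(String thread) {
        levels.get(0).addLast(thread);
    }

    // Runs the first thread of the highest non-empty level for one quantum.
    // An unfinished thread is demoted one level (the lowest level keeps it).
    // Returns the thread that ran, or null if no thread is queued.
    String runOneQuantum(boolean finished) {
        for (int i = 0; i < levels.size(); i++) {
            String t = levels.get(i).pollFirst();
            if (t != null) {
                if (!finished) {
                    int lower = Math.min(i + 1, levels.size() - 1);
                    levels.get(lower).addLast(t);
                }
                return t;
            }
        }
        return null;
    }
}
```

A thread that keeps using whole quanta sinks towards the lowest level, so a newly admitted short job always runs ahead of it.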

[Figure: the multilevel feedback queue, with levels running from HIGH to LOW priority.]

However, we can improve on this by taking other factors than the
number of time quanta consumed into account.


We can sum this up by saying a thread’s priority gets worse while it
is running, and better while it is waiting.

This is the strategy used by the schedulers in Windows and Linux
today.

At the same time, each OS allows the user some way of stating
how important they consider a process and its threads to be (on
Linux, this is the nice command).


Data structures assignment/week5d

CI583: Data Structures and Operating Systems

Compression algorithms


Outline

1 Compression

2 Run-length encoding

3 Huffman coding

Data, data everywhere!

Almost all data generated is metadata produced automatically.

The global data volume is estimated to have exceeded 64
zettabytes in 2020.


It’s hard to imagine how much data is contained in 64 zettabytes…

1 zettabyte = 1000 exabytes = 1 billion terabytes = 1 trillion GB

It has been estimated that all human words ever spoken could be
encoded into 5 exabytes (New York Times: http://tx0.org/5ls).

Compression algorithms

The field that studies this is called Information Theory, combining
mathematics and computer science, and founded by Claude
Shannon with a series of landmark works in the 1940s and 50s.

A compression algorithm can be lossless (information-preserving) or
lossy (some information considered non-essential is discarded).


Run-length encoding

WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW

this has the following RLE:

12W1B12W3B24W1B14W
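A sketch of a run-length encoder that produces the output shown above. The class and method names are chosen for the example. It writes each run as a count followed by the character, so it assumes that the input itself contains no digits; otherwise the output could not be decoded unambiguously.

```java
class RunLength {
    // Encodes each run of a repeated character as <count><character>.
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int run = 1;
            // Extend the run while the next character is the same.
            while (i + run < input.length() && input.charAt(i + run) == c) run++;
            out.append(run).append(c);
            i += run;
        }
        return out.toString();
    }
}
```

Note that a string with no repetition gets longer: RunLength.encode("WB") is "1W1B", which is why RLE suits data with long runs.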


Optimising RLE

RLE algorithms are simple and can be extremely effective especially
for data with little variation, where savings of up to 90% are not
unusual.

Burrows-Wheeler Transform

This transformation is used in the bzip compression format, widely
used on UNIX systems.


Form the set of rotations of the input. We rotate a word by moving
one character from the beginning to end.


^billing$

billing$^

illing$^b

lling$^bi

ling$^bil

ing$^bill

ng$^billi

g$^billin

$^billing
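The rotation step can be sketched in a few lines of Java. The class name Rotations is made up for the example, and only the step listed above (forming the rotations) is shown, not the rest of the transform.

```java
import java.util.ArrayList;
import java.util.List;

class Rotations {
    // Returns the input followed by each successive rotation of it,
    // where one rotation moves the first character to the end.
    static List<String> of(String s) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            result.add(s.substring(i) + s.substring(0, i));
        }
        return result;
    }
}
```

Rotations.of("^billing$") yields the nine strings listed above, in the same order.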

// addressradix sets the radix in which numerical values are displayed
// 2 is the default value
// addressradix
addressradix 16

// numpages sets the number of pages (physical and virtual)
// 64 is the default value
// numpages must be at least 2 and no more than 64
// numpages
numpages 64

The output file contains one line per operation executed. The format of each line is:

command address … status

where:

• command is READ or WRITE,
• address is a number corresponding to a virtual memory address, and
• status is okay or page fault.

Sample Output

Suggested Exercises


Data structures assignment/week4d

CI583: Data Structures and Operating Systems

Recursion, and some recursive problems


Outline

1 Recursion

2 The Towers of Hanoi


Recursion

Recursion occurs when an algorithm (or method, or function), A,
requires us to execute A again, repeatedly.

So that the execution of A does not diverge (run forever) we need
to define conditions under which the recursion will end.

We call this the base case: the conditions under which the
recursion continues are called the recursive case.

Recursion is closely linked to mathematical induction.


Any iterative algorithm can be expressed via recursion, and vice
versa.

There are (Turing complete) programming languages which don’t
include any construct for looping, such as Haskell.

Sometimes an iterative solution is the clearest, but recursive code is
often more concise and expresses the solution to the problem
elegantly and directly.

Recursion and iteration are more general constructs than the
algorithmic strategies we considered in the last lecture (greedy
algorithms, dynamic programming, etc).


We have seen recursion in OO style, when traversing tree structures:

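// Tree is assumed to declare traverse() and visit(); Node is assumed to
// hold its two subtrees in the fields left and right.
class Node extends Tree {
    // In-order traversal: left subtree, then this node, then right subtree.
    void traverse() {
        left.traverse();
        this.visit();
        right.traverse();
    }
}

class Leaf extends Tree {
    // Base case: a leaf has no subtrees, so just visit it.
    void traverse() {
        this.visit();
    }
}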

int triangle(int n) {
    if (n == 1) return 1;               // base case
    else return n + triangle(n - 1);    // recursive case
}

triangle(5)
if (5==1) 1 else 5 + triangle(5-1)
if (false) 1 else 5 + triangle(5-1)
5 + triangle(4)
5 + (if (4==1) 1 else 4 + triangle(4-1))
5 + (4 + (if (3==1) 1 else 3 + triangle(3-1)))
5 + (4 + (3 + (if (2==1) 1 else 2 + triangle(2-1))))
5 + (4 + (3 + (2 + (if (1==1) 1 else 1 + triangle(1-1)))))
5 + (4 + (3 + (2 + 1)))
15

The solution to this is to make triangle tail-recursive, meaning
that the recursive call is the last thing the method does.

In this way, we don’t need to keep intermediate values hanging
around.

Fortunately, every non-tail-recursive algorithm can be rewritten to a
tail-recursive version.

Unfortunately, the resulting code is often much less intuitive.
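A sketch of what a tail-recursive triangle might look like, using an extra accumulator parameter. The helper with the acc parameter is an addition made for the example, and the base case is moved to n == 0 so that the accumulator can start at 0. As far as I know, the standard Java compiler and JVM do not eliminate tail calls, so in Java this shows the shape of the technique rather than saving any stack space.

```java
class Triangle {
    static int triangle(int n) {
        return triangle(n, 0);
    }

    // acc carries the sum so far, so nothing remains to be done
    // after the recursive call returns.
    private static int triangle(int n, int acc) {
        if (n == 0) return acc;
        return triangle(n - 1, acc + n);
    }
}
```

Compare the trace with the earlier one: triangle(5, 0), triangle(4, 5), triangle(3, 9), triangle(2, 12), triangle(1, 14), triangle(0, 15), and the answer 15 is returned directly with no pending additions.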


In terms of the algorithmic strategies from the last lecture,
recursion is particularly important in divide-and-conquer strategies.

These are strategies in which we divide the problem into two parts
to be solved separately (or discard one of them).

An example would be Binary Search, though the algorithm we gave
in week 1 was iterative.

We will look at two efficient sorting algorithms that use a
divide-and-conquer approach, MergeSort and QuickSort.
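For comparison with the iterative version, here is a sketch of a recursive Binary Search over a sorted int array. The names are chosen for the example. At each step one half of the range is discarded and the search recurses on the other half.

```java
class RecursiveSearch {
    // Returns the index of target in the sorted array, or -1 if it is absent.
    static int search(int[] sorted, int target) {
        return search(sorted, target, 0, sorted.length - 1);
    }

    private static int search(int[] sorted, int target, int lo, int hi) {
        if (lo > hi) return -1;                       // base case: empty range
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid] == target) return mid;        // base case: found
        if (sorted[mid] < target) {
            return search(sorted, target, mid + 1, hi);   // discard the lower half
        }
        return search(sorted, target, lo, mid - 1);       // discard the upper half
    }
}
```

Both recursive calls are in tail position, which is one reason this algorithm converts so easily to a loop.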


The Towers of Hanoi


We will call the set of all discs, in their right order, the stack.
(Nothing to do with the data structure.)

[Figure: three towers labelled A, B and C.]

A recursive solution to move n discs from a source tower, S, to a
destination tower, D, via an intermediate tower, I:

1 Move the substack of n − 1 discs from S to I.
2 Move the largest disc from S to D.

3 Move the substack from I to D.

(Recursive definitions feel like cheating sometimes!) When we
begin, n = 4, S = A, I = B and D = C.

[Figures: successive positions of the discs on towers A, B and C, shown over four slides.]

But we can’t move the substack of discs [1,2,3] all at once. Still,
it’s easier than moving 4 discs…

The problem is getting smaller…


How do we move the substack [1,2] from A to C?

This one is quite easy: move 1 to B, then 2 to C, then 1 to C.

This is the base case: note that we have said exactly what we will
do without hand-waving like “move the substack…”

A recursive solution in Java:

class TowersApp {
    static int nDiscs = 3;

    public static void main(String[] args) {
        doTowers(nDiscs, 'A', 'B', 'C');
    }

    // Moves n discs from tower `from` to tower `to`, using `inter` as the spare.
    public static void doTowers(int n, char from, char inter, char to) {
        if (n == 1) {
            System.out.printf("Disc 1 from %c to %c\n", from, to);
        } else {
            // move the substack out of the way: to and inter swap roles
            doTowers(n - 1, from, to, inter);
            System.out.printf("Disc %d from %c to %c\n", n, from, to);
            // move the substack onto the largest disc: from and inter swap roles
            doTowers(n - 1, inter, from, to);
        }
    }
}

It seems astonishing at first that a problem that seems like it should
be quite complicated can be solved with so little code!

This is certainly a case of the recursive solution being elegant and
concise.

We can of course produce an iterative solution to the Towers of
Hanoi problem, but it is longer and less clear (to my mind) what is
going on.


Data structures assignment/week9c

Physical media for external storage Optimisations

CI583: Data Structures and Operating Systems
File systems: improvements over the early systems

Outline

1 Physical media for external storage

2 Optimisations


Media

So far, we have been ignoring the issue of the actual media that
data blocks are stored on.

We have been assuming that any block of data can be retrieved
with the same cost (O(1)), as if we were accessing locations in a
big array.

For a solid-state drive (SSD) this is more or less true.

Image copyright http://www.file-recovery.com/

Disk architecture

A typical disk drive consists of several platters each of which has
one or two recording surfaces.


Each surface is divided into a number of concentric tracks, and
each track is divided into a number of sectors. All the sectors are
the same size, and outer tracks have more sectors.


In order to read or write to a location on disk, the disk controller
does the following:

1. Move the heads over the correct cylinder. This is the seek time.

2. Rotate the platter until the desired sector is under the head.
This is the rotational latency.

3. Rotate the platter so that the entire sector passes under the
head, in order to read or write data. This is the transfer time.

Rotational latency depends on the rate at which the disk spins
(e.g., 10,000 RPM), and transfer time depends on the spin rate
and the number of sectors per track.

Average seek time is usually the most important factor, in the low
milliseconds in a typical drive – a very long time compared to the
speed at which processors work.

An efficient file system has to take steps to reduce this.
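The dependence of rotational latency and transfer time on the spin rate can be made concrete with a little arithmetic. The following is only a sketch, not part of the lecture material: the class and method names are hypothetical, and the figure of 500 sectors per track is an assumed value chosen purely for illustration.

```java
/** Back-of-the-envelope disk timing. All names here are illustrative. */
public class DiskTiming {

    /** Time for one full rotation of the platter, in milliseconds. */
    public static double rotationMs(double rpm) {
        return 60_000.0 / rpm;
    }

    /** On average the desired sector is half a rotation away from the head. */
    public static double avgRotationalLatencyMs(double rpm) {
        return rotationMs(rpm) / 2.0;
    }

    /** Time for a single sector to pass completely under the head. */
    public static double transferMsPerSector(double rpm, int sectorsPerTrack) {
        return rotationMs(rpm) / sectorsPerTrack;
    }

    public static void main(String[] args) {
        double rpm = 10_000;        // spin rate used as the example in the text
        int sectorsPerTrack = 500;  // assumed value, for illustration only
        System.out.println("One rotation (ms):            " + rotationMs(rpm));
        System.out.println("Avg rotational latency (ms):  " + avgRotationalLatencyMs(rpm));
        System.out.println("Transfer time per sector (ms): "
                + transferMsPerSector(rpm, sectorsPerTrack));
    }
}
```

At 10,000 RPM one rotation takes 6 ms, so the average rotational latency is 3 ms. That is the same order of magnitude as the seek time, which is why both are worth reducing.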


Optimisations

With regard to storage media, the two key optimisations the
designer of a file system can make are to reduce seek time and
increase the amount of data transferred.

A straightforward way to reduce seek time is to use buffering.

This will improve latency for reads but not writes.

Other improvements (used in file systems such as Linux ext2):

1. Increased block size. Helpful, but complex data allocation
strategies are needed to avoid excessive fragmentation (see
Doeppner, section 6.1).

2. Reduce seek time by data allocation strategies. Allocate the
next block in a way that takes disk architecture into account:
use as few cylinders as possible.

3. Reduce rotational latency by data allocation strategies.
Allocate blocks so that they can be read without repositioning
the heads.

4. Clustering. Allocate blocks in groups, rather than one by one.
ext2 pre-allocates 8 blocks at a time, eventually giving them
up if they are not used or space becomes short.

Data structures assignment/memory.zip


memory/Common.java

public class Common {

    static public long s2l(String s)
    {
        long i = 0;

        try {
            i = Long.parseLong(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        return i;
    }

    static public int s2i(String s)
    {
        int i = 0;

        try {
            i = Integer.parseInt(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        return i;
    }

    static public byte s2b(String s)
    {
        int i = 0;
        byte b = 0;

        try {
            i = Integer.parseInt(s.trim());
        } catch (NumberFormatException nfe) {
            System.out.println("NumberFormatException: " + nfe.getMessage());
        }
        b = (byte) i;
        return b;
    }

    public static long randomLong(long MAX)
    {
        long i = -1;

        java.util.Random generator = new java.util.Random(System.currentTimeMillis());

            {
                System.out.println("MemoryManagement: numpages out of bounds.");
                System.exit(-1);
            }
            address_limit = (block * virtPageNum + 1) - 1;
        }
        in.close();
    }
    catch (IOException e)
    {
        /* Handle exceptions */
    }

            {
                System.out.println("MemoryManagement: Invalid page value in " + config);
                System.exit(-1);
            }
            R = Common.s2b(st.nextToken());
            if (R < 0 || R > 1)
            {
                System.out.println("MemoryManagement: Invalid R value in " + config);
                System.exit(-1);
            }
            M = Common.s2b(st.nextToken());
            if (M < 0 || M > 1)
            {
                System.out.println("MemoryManagement: Invalid M value in " + config);
                System.exit(-1);
            }
            inMemTime = Common.s2i(st.nextToken());
            {
                System.out.println("MemoryManagement: pagesize is out of bounds");
                System.exit(-1);
            }
            {
                System.out.println("MemoryManagement: addressradix out of bounds.");
                System.exit(-1);
            }
            in.close();
        }
        catch (IOException e)
        {
            /* Handle exceptions */
        }
    }

    f = new File(commands);
    try
    {
        DataInputStream in = new DataInputStream(new FileInputStream(f));
        while ((line = in.readLine()) != null)
        {
            if (line.startsWith("READ") || line.startsWith("WRITE"))
            {
                if (line.startsWith("READ"))
                {
                    command = "READ";
                }
                if (line.startsWith("WRITE"))
                {
                    command = "WRITE";
                }
                StringTokenizer st = new StringTokenizer(line);
                tmp = st.nextToken(); // the operation token (READ or WRITE)
                tmp = st.nextToken(); // "random", a radix keyword, or a decimal address
                if (tmp.startsWith("random"))
                {
                    instructVector.addElement(new Instruction(command, Common.randomLong(address_limit)));
                }
                else
                {
                    if (tmp.startsWith("bin"))
                    {
                        addr = Long.parseLong(st.nextToken(), 2);
                    }
                    else if (tmp.startsWith("oct"))
                    {
                        addr = Long.parseLong(st.nextToken(), 8);
                    }
                    else if (tmp.startsWith("hex"))
                    {
                        addr = Long.parseLong(st.nextToken(), 16);
                    }
                    else
                    {
                        addr = Long.parseLong(tmp);
                    }
                    if (0 > addr || addr > address_limit)
                    {
                        System.out.println("MemoryManagement: " + addr + ", Address out of range in " + commands);
                        System.exit(-1);
                    }
                    instructVector.addElement(new Instruction(command, addr));
                }
            }
        }
        in.close();
    }
    catch (IOException e)
    {
        /* Handle exceptions */
    }
    runcycles = instructVector.size();

            {
                physical_count++;
            }
            if (physical_count > 1)
            {
                System.out.println("MemoryManagement: Duplicate physical page's in " + config);
                System.exit(-1);
            }
            physical_count = 0;
        }
        {
            System.out.println("MemoryManagement: Instruction (" + instruct.inst + " " + instruct.addr + ") out of bounds.");
            System.exit(-1);
        }

    public void setControlPanel(ControlPanel newControlPanel)
    {
        controlPanel = newControlPanel;
    }

    public void getPage(int pageNum)
    {
        Page page = (Page) memVector.elementAt(pageNum);
        controlPanel.paintPage(page);
    }

    // Appends message to the trace file by re-reading the existing
    // contents and rewriting the whole file.
    private void printLogFile(String message)
    {
        String line;
        String temp = "";

        File trace = new File(output);
        if (trace.exists())
        {
            try
            {
                DataInputStream in = new DataInputStream(new FileInputStream(output));
                while ((line = in.readLine()) != null)
                {
                    temp = temp + line + lineSeparator;
                }
                in.close();
            }
            catch (IOException e)
            {
                /* Do nothing */
            }
        }
        try
        {
            PrintStream out = new PrintStream(new FileOutputStream(output));
            out.print(temp);
            out.print(message);
            out.close();
        }
        catch (IOException e)
        {
            /* Do nothing */
        }
    }

    // Executes every instruction, pausing two seconds between steps.
    public void run()
    {
        step();
        while (runs != runcycles)
        {
            try
            {
                Thread.sleep(2000);
            }
            catch (InterruptedException e)
            {
                /* Do nothing */
            }
            step();
        }
    }

    public void step()
    {
        int i = 0;

        Instruction instruct = (Instruction) instructVector.elementAt(runs);
        controlPanel.instructionValueLabel.setText(instruct.inst);
        controlPanel.addressValueLabel.setText(Long.toString(instruct.addr, addressradix));
        getPage(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
        // Compare string contents rather than references.
        if ("YES".equals(controlPanel.pageFaultValueLabel.getText()))
        {
            controlPanel.pageFaultValueLabel.setText("NO");
        }
        if (instruct.inst.startsWith("READ"))
        {
            Page page = (Page) memVector.elementAt(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
            if (page.physical == -1)
            {
                if (doFileLog)
                {
                    printLogFile("READ " + Long.toString(instruct.addr, addressradix) + " ... page fault");
                }
                if (doStdoutLog)
                {
                    System.out.println("READ " + Long.toString(instruct.addr, addressradix) + " ... page fault");
                }
                PageFault.replacePage(memVector, virtPageNum, Virtual2Physical.pageNum(instruct.addr, virtPageNum, block), controlPanel);
                controlPanel.pageFaultValueLabel.setText("YES");
            }
            else
            {
                page.R = 1;
                page.lastTouchTime = 0;
                if (doFileLog)
                {
                    printLogFile("READ " + Long.toString(instruct.addr, addressradix) + " ... okay");
                }
                if (doStdoutLog)
                {
                    System.out.println("READ " + Long.toString(instruct.addr, addressradix) + " ... okay");
                }
            }
        }
        if (instruct.inst.startsWith("WRITE"))
        {
            Page page = (Page) memVector.elementAt(Virtual2Physical.pageNum(instruct.addr, virtPageNum, block));
            if (page.physical == -1)
            {
                if (doFileLog)
                {
                    printLogFile("WRITE " + Long.toString(instruct.addr, addressradix) + " ... page fault");
                }
                if (doStdoutLog)
                {
                    System.out.println("WRITE " + Long.toString(instruct.addr, addressradix) + " ... page fault");
                }
                PageFault.replacePage(memVector, virtPageNum, Virtual2Physical.pageNum(instruct.addr, virtPageNum, block), controlPanel);
                controlPanel.pageFaultValueLabel.setText("YES");
            }
            else
            {
                page.M = 1;
                page.lastTouchTime = 0;
                if (doFileLog)
                {
                    printLogFile("WRITE " + Long.toString(instruct.addr, addressradix) + " ... okay");
                }
                if (doStdoutLog)
                {
                    System.out.println("WRITE " + Long.toString(instruct.addr, addressradix) + " ... okay");
                }
            }
        }

memory/MemoryManagement.java

// The main MemoryManagement program

import java.applet.*;
import java.awt.*;
import java.io.*;
import java.util.*;
//import ControlPanel;
//import PageFault;
//import Virtual2Physical;
//import Common;
//import Page;

public class MemoryManagement
{
    public static void main(String[] args)
    {
        ControlPanel controlPanel;
        Kernel kernel;

        // One or two arguments are accepted: a command file and,
        // optionally, a configuration file.
        if (args.length < 1 || args.length > 2)
        {
            System.out.println("Usage: 'java MemoryManagement '");
            System.exit(-1);
        }

        File f = new File(args[0]);

        if (!(f.exists()))
        {
            System.out.println("MemoryM: error, file '" + f.getName() + "' does not exist.");
            System.exit(-1);
        }
        if (!(f.canRead()))
        {
            System.out.println("MemoryM: error, read of " + f.getName() + " failed.");
            System.exit(-1);
        }

        if (args.length == 2)
        {
            f = new File(args[1]);

            if (!(f.exists()))
            {
                System.out.println("MemoryM: error, file '" + f.getName() + "' does not exist.");
                System.exit(-1);
            }
            if (!(f.canRead()))
            {
                System.out.println("MemoryM: error, read of " + f.getName() + " failed.");
                System.exit(-1);
            }
        }

        kernel = new Kernel();
        controlPanel = new ControlPanel("Memory Management");
        if (args.length == 1)
        {
            controlPanel.init(kernel, args[0], null);
        }
        else
        {
            controlPanel.init(kernel, args[0], args[1]);
        }
    }
}

memory/Page.java

public class Page
{
    public int id;
    public int physical;
    public byte R;
    public byte M;
    public int inMemTime;
    public int lastTouchTime;
    public long high;
    public long low;

    public Page(int id, int physical, byte R, byte M, int inMemTime, int lastTouchTime, long high, long low)
    {
        this.id = id;
        this.physical = physical;
        this.R = R;
        this.M = M;
        this.inMemTime = inMemTime;
        this.lastTouchTime = lastTouchTime;
        this.high = high;
        this.low = low;
    }
}

memory/PageFault.java

/* It is in this file, specifically the replacePage function that will
   be called by MemoryManagement when there is a page fault. The
   users of this program should rewrite PageFault to implement the
   page replacement algorithm.
*/

// This PageFault file is an example of the FIFO Page Replacement Algorithm.

import java.util.*;
//import Page;

public class PageFault
{
    /**
     * The page replacement algorithm for the memory management simulator.
     * This method gets called whenever a page needs to be replaced.
     *
     * The page replacement algorithm included with the simulator is
     * FIFO (first-in first-out). A while or for loop should be used
     * to search through the current memory contents for a candidate
     * replacement page. In the case of FIFO the while loop is used
     * to find the proper page while making sure that virtPageNum is
     * not exceeded.
     *
     *   Page page = ( Page ) mem.elementAt( oldestPage )
     *
     * This line brings the contents of the Page at oldestPage (a
     * specified integer) from the mem vector into the page object.
     * Next recall the contents of the target page, replacePageNum.
     * Set the physical memory address of the page to be added equal
     * to the page to be removed.
     *
     *   controlPanel.removePhysicalPage( oldestPage )
     *
     * Once a page is removed from memory it must also be reflected
     * graphically. This line does so by removing the physical page
     * at the oldestPage value. The page which will be added into
     * memory must also be displayed through the addPhysicalPage
     * function call. One must also remember to reset the values of
     * the page which has just been removed from memory.
     *
     * @param mem is the vector which contains the contents of the pages
     *   in memory being simulated. mem should be searched to find the
     *   proper page to remove, and modified to reflect any changes.
     * @param virtPageNum is the number of virtual pages in the
     *   simulator (set in Kernel.java).
     * @param replacePageNum is the requested page which caused the
     *   page fault.
     * @param controlPanel represents the graphical element of the
     *   simulator, and allows one to modify the current display.
     */

    public static void replacePage(Vector mem, int virtPageNum, int replacePageNum, ControlPanel controlPanel)
    {
        int count = 0;
        int oldestPage = -1;
        int oldestTime = 0;
        int firstPage = -1;
        int map_count = 0;
        boolean mapped = false;

        // Scan every virtual page, remembering the mapped page that
        // has been in memory the longest.
        while (!(mapped) || count != virtPageNum)
        {
            Page page = (Page) mem.elementAt(count);
            if (page.physical != -1)
            {
                if (firstPage == -1)
                {
                    firstPage = count;
                }
                if (page.inMemTime > oldestTime)
                {
                    oldestTime = page.inMemTime;
                    oldestPage = count;
                    mapped = true;
                }
            }
            count++;
            if (count == virtPageNum)
            {
                mapped = true;
            }
        }
        if (oldestPage == -1)
        {
            oldestPage = firstPage;
        }

        // Hand the victim's physical page to the requested page and
        // reset the victim's bookkeeping.
        Page page = (Page) mem.elementAt(oldestPage);
        Page nextpage = (Page) mem.elementAt(replacePageNum);
        controlPanel.removePhysicalPage(oldestPage);
        nextpage.physical = page.physical;
        controlPanel.addPhysicalPage(nextpage.physical, replacePageNum);
        page.inMemTime = 0;
        page.lastTouchTime = 0;
        page.R = 0;
        page.M = 0;
        page.physical = -1;
    }
}

memory/tracefile
READ 4 … okay
READ 13 … okay
WRITE cc32 … okay
READ 4000 … okay
READ 4000 … okay
WRITE 6001 … okay
WRITE 43bad … okay

memory/user_guide.html

MOSS
Memory Management Simulator
User Guide

Purpose

Introduction

Running the Simulator

The program reads a command file, optionally reads
a configuration file, displays a GUI window which
allows you to execute the command file, and optionally
writes a trace file.

To run the program, enter the following command line.

$ java MemoryManagement commands memory.conf

The buttons:

Button
Description

run
runs the simulation to completion. Note that the simulation
pauses and updates the screen between each step.

step
runs a single step of the simulation and updates the display.

reset
initializes the simulator and starts from the beginning of
the command file.

exit
exits the simulation.

page n
display information about this virtual page in the display
area at the right.

The informational display:

Field
Description

status:
RUN, STEP, or STOP. This indicates whether the current
run or step is completed.

time:
number of “ns” since the start of the simulation.

instruction:
READ or WRITE. The operation last performed.

address:
the virtual memory address of the operation last performed.

page fault:
whether the last operation caused a page fault to occur.

virtual page:
the number of the virtual page being displayed in the
fields below. This is the last virtual page accessed by the simulator,
or the last page n button pressed.

physical page:
the physical page for this virtual page, if any. -1
indicates that no physical page is associated with this virtual page.

R:
whether this page has been read. (1=yes, 0=no)

M:
whether this page has been modified. (1=yes, 0=no)

inMemTime:
number of ns ago the physical page was allocated to this virtual
page.

lastTouchTime:
number of ns ago the physical page was last touched (read or modified).

low:
low virtual memory address of the virtual page.

high:
high virtual memory address of the virtual page.

The Command File

Operations on Virtual Memory

There are two operations one can carry out on pages in memory:
READ and WRITE.

The format for each command is

operation address

or
operation random

For example, the sequence

READ bin 01010101
WRITE bin 10101010
READ random
WRITE random

causes the virtual memory manager to:
read from virtual memory address 85

write to virtual memory address 170

read from some random virtual memory address

write to some random virtual memory address
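As an illustration of how the address part of a command line can be interpreted, here is a minimal sketch. It is not the simulator's own parser: the class name, the method name and the use of -1 as a marker for "random" are assumptions made for this example.

```java
/** Sketch of parsing "operation [radix] address" lines; names are hypothetical. */
public class CommandAddress {

    /** Marker returned for "operation random" (an assumption of this sketch). */
    public static final long RANDOM = -1;

    /** Returns the address named by a command line such as "READ bin 01010101". */
    public static long parseAddress(String line) {
        String[] parts = line.trim().split("\\s+");
        String second = parts[1];            // parts[0] is READ or WRITE
        if (second.equals("random")) {
            return RANDOM;
        } else if (second.equals("bin")) {
            return Long.parseLong(parts[2], 2);
        } else if (second.equals("oct")) {
            return Long.parseLong(parts[2], 8);
        } else if (second.equals("hex")) {
            return Long.parseLong(parts[2], 16);
        }
        return Long.parseLong(second);       // no keyword: plain decimal
    }

    public static void main(String[] args) {
        System.out.println(parseAddress("READ bin 01010101"));
        System.out.println(parseAddress("WRITE bin 10101010"));
    }
}
```

Running `main` prints 85 and 170, matching the addresses given for the example sequence.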

Sample Command File

The “commands” input file looks like this:

// Enter READ/WRITE commands into this file
// READ
// WRITE
READ bin 100
READ 19
WRITE hex CC32
READ bin 100000000000000
READ bin 100000000000000
WRITE bin 110000000000001
WRITE random

The Configuration File

The memset command is used to initialize each
entry in the virtual page map.
memset is followed by six integer values:

The virtual page # to initialize

The physical page # associated with this virtual page
(-1 if no page assigned)

If the page has been read from (R) (0=no, 1=yes)

If the page has been modified (M) (0=no, 1=yes)

The amount of time the page has been in memory (in ns)

The last time the page has been modified (in ns)

The first two parameters define the mapping between
the virtual page and a physical page, if any.
The last four parameters are values that might be used
by a page replacement algorithm.
For example,

memset 34 23 0 0 0 0

specifies that virtual page 34 maps to physical page 23,
and that the page has not been read or modified.
Note:

Each physical page should be mapped to exactly one virtual page.

The number of virtual pages is fixed at 64 (0..63).

The number of physical pages cannot exceed 64 (0..63).

If a virtual page is not specified by any memset command,
it is assumed that the page is not mapped.
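The six memset values and the limits noted above can be checked with a few lines of code. This is a hedged sketch rather than the simulator's implementation: the class `MemsetLine` and its field names are hypothetical, and the only limits enforced are the ones stated in this guide (virtual pages 0..63, physical pages -1..63, R and M either 0 or 1).

```java
import java.util.StringTokenizer;

/** Sketch of a parsed "memset" line; the class and field names are hypothetical. */
public class MemsetLine {
    public final int virtualPage;
    public final int physicalPage;   // -1 means no physical page assigned
    public final int r;              // 1 if the page has been read from
    public final int m;              // 1 if the page has been modified
    public final int inMemTime;      // ns the page has been in memory
    public final int lastTouchTime;  // ns value from the last field of the line

    private MemsetLine(int v, int p, int r, int m, int in, int last) {
        this.virtualPage = v;
        this.physicalPage = p;
        this.r = r;
        this.m = m;
        this.inMemTime = in;
        this.lastTouchTime = last;
    }

    /** Parses e.g. "memset 34 23 0 0 0 0", rejecting values outside the documented limits. */
    public static MemsetLine parse(String line) {
        StringTokenizer st = new StringTokenizer(line);
        if (st.countTokens() != 7 || !st.nextToken().equals("memset")) {
            throw new IllegalArgumentException("expected 'memset' followed by six integers");
        }
        int v = Integer.parseInt(st.nextToken());
        int p = Integer.parseInt(st.nextToken());
        int r = Integer.parseInt(st.nextToken());
        int m = Integer.parseInt(st.nextToken());
        int in = Integer.parseInt(st.nextToken());
        int last = Integer.parseInt(st.nextToken());
        if (v < 0 || v > 63) throw new IllegalArgumentException("virtual page out of range");
        if (p < -1 || p > 63) throw new IllegalArgumentException("physical page out of range");
        if (r < 0 || r > 1) throw new IllegalArgumentException("R must be 0 or 1");
        if (m < 0 || m > 1) throw new IllegalArgumentException("M must be 0 or 1");
        return new MemsetLine(v, p, r, m, in, last);
    }
}
```

For example, `MemsetLine.parse("memset 34 23 0 0 0 0")` yields virtual page 34 mapped to physical page 23 with R and M both 0.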

Other Configuration File Options

There are a number of other options which can
be specified in the configuration file. These are
summarized in the table below.

Keyword Values Description

addressradix n
The radix in which numerical values are displayed.
The default radix is 2 (binary). You may prefer radix
8 (octal), 10 (decimal), or 16 (hexadecimal).

Sample Configuration File

The “memory.conf” configuration file looks like this:

The Output File

The output file contains one line per operation
executed. The format of each line is:

command address … status

where:
command is READ or WRITE,

address is a number corresponding to a virtual memory address, and

status is okay or page fault.

Sample Output

The output “tracefile” looks something like this:

READ 4 … okay
READ 13 … okay
WRITE 3acc32 … okay
READ 10000000 … okay
READ 10000000 … okay
WRITE c0001000 … page fault
WRITE 2aeea2ef … okay

Suggested Exercises

Modify replacePage() in
PageFault.java to implement a round robin
page replacement algorithm
(i.e., first page fault
replaces page 0, next one replaces page 1, next one replaces page 2,
etc.).

Modify replacePage() in
PageFault.java to implement a least recently used
(LRU) page replacement algorithm.

To Do

Please send suggestions, corrections, and comments to
Ray Ontko
(rayo@ontko.com).

Last updated: July 28, 2001

memory/user_guide_1.gif

memory/Virtual2Physical.java

import java.util.Vector;

public class Virtual2Physical
{
    public static int pageNum(long memaddr, int numpages, long block)
    {
        int i = 0;
        long high = 0;
        long low = 0;

Hash tables

Values are stored in an array, to which keys provide an index. Cells
in the array are often called buckets.

In order to translate a key (which might be a string, numeric or
other value) into an array index, a hash function is used.

The hash function is a constant time function, as is array access,
so the hash table provides O(1) performance for insertion, deletion
and search.

However, this is only true if the hash table is not more than about
two-thirds full, for reasons that will be explained.

[Image omitted. Image © http://en.wikipedia.org/]

background

class Employee {
    int empNumber;
    String surname;
    String forename;
    //…
}

The employee numbers range from 1 to 1000 and there is no need
for deletions: if an employee leaves, we want to keep their records
on file.

Motivation

In this case it’s perfectly reasonable to store Employee objects in
an array and use employee numbers as keys.

Whenever we run out of space, we can create a new, larger array
and copy the old records across. In this way we get the constant
time performance of an array.

However, very few keys are as well-behaved as this one.

These keys run sequentially and there will be no deletions, meaning
that we don’t need to worry about fragmentation. The maximum
employee number is a reasonable size for an array.

What if we want to use a string as the key? This might be to store
employee records by National Insurance number, or to store a
dictionary of words and definitions.

Say we want to store the contents of a 50,000 word dictionary in an
array. Each definition occupies its own cell and we can look it up
without searching through the whole array if we know the index.

What we need is a way of converting a string into an appropriate
index number.

Hash functions

We know that there are various encodings used to map characters
to numbers, such as ASCII in which a=97, b=98 and so on.

How can we use this to encode a sequence of characters?

A simplistic approach would be to sum the code numbers for each
character, using a simpler encoding in which a=1, b=2, and so on up
to z=26. To encode the word “cats” we have

c=3,

a=1,

t=20, and

s=19.

Thus cats=43. If we restrict ourselves to 10-letter words the
largest code is that for “zzzzzzzzzz”: 260.

So, we can only store 260 unique values, but our dictionary
contains 50,000 word/definition pairs.

We could store a list of the definitions whose keys have the same
encoding in each cell: “tin”, “give”, “tend” and hundreds of other
words all map to 43 in our encoding.

On average, each entry will contain a list of 192 definitions
(50,000/260) and we have lost the constant time performance.

Our array was too small before, so we need to make it bigger…

At the other end of the scale, we could store each definition in its
own cell. So we need to map strings to unique indices.

Recall that we can think of a decimal number in terms of powers of
10. E.g.,

862,486 = 8×10^5 + 6×10^4 + 2×10^3 + 4×10^2 + 8×10^1 + 6×10^0.

Similarly, we can break down a word into individual character codes
and multiply them by powers of 27 (the radix of our encoding):

key(“cats”) = 3×27^3 + 1×27^2 + 20×27^1 + 19×27^0.

This does indeed give us a unique key for every string.
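The two encodings discussed so far can be compared directly in code. This is a minimal sketch under the assumptions already made in the text (lower-case words, a=1 up to z=26); the class and method names are invented for the example.

```java
/** Sketch comparing the summing and positional encodings; names are hypothetical. */
public class WordKeys {

    /** Letter code with a=1, b=2, ..., z=26 (lower-case input assumed). */
    public static int code(char ch) {
        return ch - 'a' + 1;
    }

    /** The simplistic encoding: add up the letter codes. */
    public static int sumKey(String word) {
        int sum = 0;
        for (int i = 0; i < word.length(); i++) {
            sum += code(word.charAt(i));
        }
        return sum;
    }

    /** The positional encoding: each code is multiplied by a power of 27. */
    public static long positionalKey(String word) {
        long key = 0;
        long power = 1; // 27^0, 27^1, 27^2, ...
        for (int i = word.length() - 1; i >= 0; i--) {
            key += code(word.charAt(i)) * power;
            power *= 27;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(sumKey("cats") + " " + sumKey("tin"));
        System.out.println(positionalKey("cats") + " " + positionalKey("tin"));
    }
}
```

Under the summing scheme “cats” and “tin” both come out as 43, whereas the positional scheme gives “cats” the key 60337 and “tin” a different one. A `long` is used because keys for 10-letter words are far too large for an `int`.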

The other problem is that most of the array will be empty: the
scheme assigns an index to any combination of 10 characters, most
of which aren’t English words.

We need to compress a huge range of numbers into one that will fit
in a reasonably sized array.

We have 50,000 words to store, so you might presume we need an
array with 50,000 elements, although we will actually want about
twice that amount, as will become clear.

We can do this using the modulus operator, %:

index = hugeNumber(key) % arraySize.

This is an example of a hash function. A perfect hash maps distinct
keys to unique locations, but this is normally not possible.

In the case of our dictionary, max1 is more than 7 trillion and max2
is 100,000. Numbers in the big range will overflow numeric types,
but we will deal with that later.

Several elements in the big range may map to a single element in the
smaller range and some elements in the small range have nothing that
maps to them.

On average, we have one entry for every two cells.

[Figure: elements of a large range, 0 to max1, mapped onto a smaller
range, 0 to max2.]

Handling collisions

There are several ways we could respond when a collision occurs.
The first, called open addressing, is to search in some systematic
way for an empty location and store the new value there.

The second approach would be to store a collection of entries at
each index: this is called separate chaining, and makes it clear why
each index in the array is sometimes called a bucket.

Open addressing

The simplest way to handle open addressing is called linear probing:
If the location that we want to store a value in is occupied, we
search sequentially for the next unoccupied location.

If we want to search for “cats” later on, we will first look at the
index we get by hashing the key: 54321.

If it were unoccupied, then the search would simply fail. In this
case, the cell is occupied, but with the wrong data.

This should make it clear that we need to store the key as well as
any other data.

So we are actually storing word/definition pairs and we can see that
the value stored at 54321 isn’t the right one.

We search forward sequentially: this is the linear probe. As soon
as we come to an empty cell, we know the search has failed.

Searching like this makes deletions problematic: say “kumquat” also
maps to 54321 and was inserted after “banana” but before “cats”,
so that it occupies index 54322.

If we later delete “kumquat” we need to store a special value there,
some sort of dummy Nil item.

Then the probe for “cats” does not stop at index 54322, as it would
if that cell were simply left empty.

The search for the location for a new entry can use one that
contains this special value.

If there are a lot of deletions the table will contain lots of dummy
Nil items, making searching less efficient. The number of items
searched to retrieve or insert an item is called the probe length.

Many hash table implementations don’t allow deletions for this
reason. Similarly, duplicates are normally disallowed and inserting a
value with an existing key just overwrites the existing value.

So, our insert algorithm will now probe for the first empty location,
starting with the result of the hash function for the key, but will
accept a location that already contains the key.
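The probing, deletion and overwrite rules described above can be put together in a small table. This is one possible sketch, not the module's reference implementation. The class name and method names are hypothetical, and the hash is deliberately the weak "sum of letter codes" scheme from earlier so that “tin”, “give” and “tend” all collide.

```java
/** Sketch of open addressing with linear probing; all names are hypothetical. */
public class LinearProbeTable {

    /** A key/value pair: the key is stored so a probe can recognise it. */
    private static final class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    /** Dummy item left behind by a deletion so that later probes keep going. */
    private static final Entry NIL = new Entry(null, null);

    private final Entry[] cells;

    public LinearProbeTable(int capacity) {
        cells = new Entry[capacity];
    }

    /** Weak hash (sum of letter codes, a=1) chosen to provoke collisions. */
    private int hash(String key) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i) - 'a' + 1;
        }
        return Math.floorMod(sum, cells.length);
    }

    /** Returns the index holding key, or -1 once an empty cell ends the probe. */
    private int find(String key) {
        int start = hash(key);
        for (int n = 0; n < cells.length; n++) {
            int i = (start + n) % cells.length;
            Entry e = cells[i];
            if (e == null) return -1;                      // empty cell: search failed
            if (e != NIL && e.key.equals(key)) return i;   // skip over Nil items
        }
        return -1;
    }

    /** Inserts, or overwrites an existing key; returns false if the table is full. */
    public boolean put(String key, String value) {
        int start = hash(key);
        int free = -1; // first reusable cell seen (Nil or empty)
        for (int n = 0; n < cells.length; n++) {
            int i = (start + n) % cells.length;
            Entry e = cells[i];
            if (e == null) {
                if (free == -1) free = i;
                break;
            }
            if (e == NIL) {
                if (free == -1) free = i;
            } else if (e.key.equals(key)) {
                e.value = value;
                return true;
            }
        }
        if (free == -1) return false;
        cells[free] = new Entry(key, value);
        return true;
    }

    public String get(String key) {
        int i = find(key);
        return i == -1 ? null : cells[i].value;
    }

    /** Replaces the entry with the Nil item rather than emptying the cell. */
    public boolean remove(String key) {
        int i = find(key);
        if (i == -1) return false;
        cells[i] = NIL;
        return true;
    }
}
```

Note that `put` keeps probing past a Nil item before reusing it, because the key might already be stored further along the probe sequence. Reusing the first Nil cell immediately could create a duplicate.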


Data structures assignment/week5c

CI583: Data Structures and Operating Systems

Implementing hashtables: prime numbers and hash functions

A note on prime numbers

Prime numbers, those which no number divides evenly except 1 and
the number itself, are an essential part of many algorithms.

We often need to generate a new prime number, and some
algorithms call for very large ones. Unfortunately, most methods for
testing primality are O(2^n).

A naive way to test primality:

boolean isPrime(int n) {
    if (n < 2) return false;      // 0 and 1 are not prime
    for (int i = 2; i < n; i++) {
        if (n % i == 0) return false;
    }
    return true;
}

Implementing a hash function

Note the presence of the constants 27 and 96. Why?

int hash1(String key) {
    int hashVal = 0;
    int pow27 = 1; // 1, 27, 27*27, etc
    for (int i = key.length() - 1; i >= 0; i--) {
        int c = key.charAt(i) - 96;
        hashVal += pow27 * c;
        pow27 *= 27;
    }
    return hashVal % arraySize;
}

We can improve on this. The first insight is to use Horner’s
Method: see (p41, Cormen).

This formula allows us to transform a polynomial written in monomial
form (a sum of terms, each a coefficient times a power of n) into a
computationally efficient form:

vn^4 + wn^3 + xn^2 + yn^1 + zn^0 = (((vn + w)n + x)n + y)n + z.

Applying this insight we rewrite the method, this time starting with
the leftmost character:

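int hash2(String key) {
    int hashVal = key.charAt(0) - 96;
    // Horner's method: multiply the running total by 27, then add the next code
    for (int i = 1; i < key.length(); i++) {
        int c = key.charAt(i) - 96;
        hashVal = hashVal * 27 + c;
    }
    return hashVal % arraySize;
}

We compare each element e with its successor e′. If e > e′, then their
places are swapped. We continue in this way until we reach the
end of the collection, which is the end of the first pass.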

Bubble sort

After the first pass, the last element in the collection is the largest
one, and that position is sorted.


One pass of bubble sort:

23 7 5 31 9 18 4 32 6


(This is more low-level than pseudo-code should normally be, but I
think it’s the clearest way to present the algorithm.)

procedure BubbleSort(arr, n)            // n is the length of arr
    swapped                             // boolean: did we make any swaps this pass?
    for i from n − 1 down to 1 do       // loop backwards
        swapped ← false
        for j from 0 up to i − 1 do     // loop forwards
            if arr[j] > arr[j + 1] then
                swap(arr[j], arr[j + 1])
                swapped ← true
            end if
        end for
        if swapped = false then
            break
        end if
    end for
end procedure
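For comparison, here is a direct Java rendering of the bubble sort procedure. It is a sketch with a hypothetical class name. The inner loop stops one short of `i` so that `arr[j + 1]` never reads past the end of the array.

```java
import java.util.Arrays;

/** Sketch of bubble sort with early exit; the class name is hypothetical. */
public class BubbleSortDemo {

    public static void bubbleSort(int[] arr) {
        for (int i = arr.length - 1; i > 0; i--) {    // loop backwards
            boolean swapped = false;                   // any swaps this pass?
            for (int j = 0; j < i; j++) {              // loop forwards
                if (arr[j] > arr[j + 1]) {
                    int tmp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) {
                break;                                 // already sorted
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        bubbleSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```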

Complexity

On the first pass, bubble sort carries out n − 1 comparisons. In the
best case, there are no swaps and the algorithm terminates. We
will concentrate on the worst case.

So, the first pass does n − 1 comparisons, the second n − 2, and so
on.

Let W(n) be the worst case for n elements. Then

W(n) = \sum_{i=1}^{n-1} i = \frac{(n-1)n}{2} = \frac{n^2 - n}{2} \approx \frac{1}{2} n^2 = O(n^2)

As a rule of thumb, note that any bubble sort implementation
will have inner and outer loops over the same collection, or some
equivalent structure.

When we see this pattern we can take it to mean O(n^2), as we are
carrying out an O(n) task n times. The simplest sorting algorithms
are all in this order.

Selection sort

One pass of selection sort:

23 7 5 31 9 18 4 32 6

procedure SelectionSort(arr, n)                ▷ n is the length of arr
    for out from 0 up to n − 2 do              ▷ stop at one before the last element
        min ← out
        for in from out + 1 up to n − 1 do
            if arr[min] > arr[in] then
                min ← in
            end if
        end for
        swap(arr[out], arr[min])
    end for
end procedure
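A minimal Java sketch of the same procedure, assuming an int array; the class and method names are hypothetical. Note that it makes at most one swap per pass of the outer loop.

```java
import java.util.Arrays;

// Hypothetical sketch of selection sort on an int array.
class SelectionSortSketch {
    static void selectionSort(int[] arr) {
        int n = arr.length;
        for (int out = 0; out <= n - 2; out++) {    // stop one before the last element
            int min = out;                          // index of the smallest value seen so far
            for (int in = out + 1; in <= n - 1; in++) {
                if (arr[min] > arr[in]) {
                    min = in;
                }
            }
            int tmp = arr[out];                     // one swap per pass
            arr[out] = arr[min];
            arr[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        selectionSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```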

Complexity

In the worst case, we can see that selection sort is O(n^2), by the
same reasoning as for bubble sort.

Still, making fewer swaps gives selection sort better average
performance.


Insertion sort

Most of the time, insertion sort has the best performance of the
simple sorts we’re looking at today. It is still O(n^2) but about
twice as fast as bubble sort.

The idea is to start by sorting the first two elements then, as we
move along the collection, to insert each element in the right place
in the sorted part of the collection.


So, part of the input is always sorted and we keep inserting items
into that part. The sorted part grows and the unsorted part shrinks
until there is nothing left to do.

We need to keep track of which part of the collection is sorted,
and we need to store temporary values as we make room for an
element to be moved.

Part of a run of insertion sort:

23 7 5 31 9 18 4 32 6

[Figure: successive steps of insertion sort on the array above, showing the boundary of the unsorted part and the temp slot that holds the element being inserted.]

You will probably have to think about this and compare it to the
applet to convince yourself that it’s true.

procedure InsertionSort(arr, n)                ▷ n is the length of arr
    for out from 1 up to n − 1 do              ▷ out is the dividing line
        tmp ← arr[out]
        in ← out                               ▷ start shifts at out
        while in > 0 and arr[in − 1] > tmp do
            arr[in] ← arr[in − 1]
            in ← in − 1
        end while
        arr[in] ← tmp
    end for
end procedure
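The procedure translates almost line for line into Java. The sketch below assumes an int array and uses hypothetical class and method names.

```java
import java.util.Arrays;

// Hypothetical sketch of insertion sort on an int array.
class InsertionSortSketch {
    static void insertionSort(int[] arr) {
        for (int out = 1; out < arr.length; out++) {  // out is the dividing line
            int tmp = arr[out];                       // element to insert
            int in = out;                             // start shifts at out
            while (in > 0 && arr[in - 1] > tmp) {
                arr[in] = arr[in - 1];                // shift larger element right
                in--;
            }
            arr[in] = tmp;                            // drop tmp into the gap
        }
    }

    public static void main(String[] args) {
        int[] data = {23, 7, 5, 31, 9, 18, 4, 32, 6};
        insertionSort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

On input that is already sorted the while condition fails immediately on every pass, which is why the algorithm does so well on sorted or almost sorted data.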

Complexity

The previous two sorts reduce the size of the problem at each pass.
In this case we increase it. On the first pass, we make one
comparison, on the second, a maximum of two, and so on.

$$1 + 2 + \cdots + (n - 1) = \frac{n(n-1)}{2}$$

So the worst case is the same: O(n^2). However, insertion sort
performs much better when the data is sorted or “almost” sorted.


Data structures assignment/week2b

CI583: Data Structures and Operating Systems

A sorting algorithm which is impressively efficient…


Radix sort


Consider sorting the following data:

[ 310, 213, 023, 130, 013, 301, 222, 032, 201, 111, 323, 002, 330,
102, 231, 120 ].

Note that some elements are padded so that all elements have the
same number of digits.


We start by collecting elements with the same last (least significant) digit.

Bucket  Contents
0       310 130 330 120
1       301 201 111 231
2       222 032 002 102
3       213 023 013 323

Emptying the buckets gives us a new list:

[ 310, 130, 330, 120, 301, 201, 111, 231, 222, 032, 002, 102, 213,
023, 013, 323 ].


Using the new list, we collect elements with the same second (middle) digit.

Bucket  Contents
0       301 201 002 102
1       310 111 213 013
2       120 222 023 323
3       130 330 231 032

Emptying the buckets again:

[ 301, 201, 002, 102, 310, 111, 213, 013, 120, 222, 023, 323, 130,
330, 231, 032 ].


Finally, we collect elements with the same leftmost (most significant) digit.

Bucket  Contents
0       002 013 023 032
1       102 111 120 130
2       201 213 222 231
3       301 310 323 330

This time, emptying the buckets will give us a sorted list.


Radix sort has the air of a card trick about it, but it actually
corresponds to how people sort things (such as socks:
http://tx0.org/5c3) in real life.

Sticking to computing, we can use radix sort on data with other
kinds of keys too, such as strings (using 26 buckets or 52 for a
case-sensitive sort).


Generally, we need as many buckets as the number base or radix of
the input. We need to inspect each element k times, where k is
the number of digits in the biggest element.

k will be relatively small compared to n (e.g. when k = 6, we
could have almost a million unique records). So the number of steps
required is of order O(kn), or just O(n) if we treat k as a constant.


The algorithm can be stated very elegantly:

procedure radixSort(list, n)                   ▷ sort list having n elements with base 10
    shift ← 1
    for loop = 1 to keySize do
        for entry = 1 to n do
            bucketNum ← (list[entry] / shift) % 10
            append(buckets[bucketNum], list[entry])
        end for
        list ← combineBuckets()
        shift ← shift ∗ 10
    end for
end procedure
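As a rough illustration, the procedure could be written in Java like this. The sketch assumes non-negative int keys in base 10, passes keySize in as a parameter, and folds combineBuckets into the loop that copies the buckets back into the array; all names are hypothetical. The slide data is written without leading zeros because a Java literal such as 023 would be read as octal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of least-significant-digit radix sort, base 10.
class RadixSortSketch {
    static final int RADIX = 10;

    // Sorts non-negative keys that have at most keySize decimal digits.
    static void radixSort(int[] list, int keySize) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int b = 0; b < RADIX; b++) {
            buckets.add(new ArrayList<>());
        }
        int shift = 1;
        for (int pass = 0; pass < keySize; pass++) {
            for (int value : list) {
                int bucketNum = (value / shift) % RADIX;
                buckets.get(bucketNum).add(value);
            }
            int k = 0;                               // combine buckets back into list
            for (List<Integer> bucket : buckets) {
                for (int value : bucket) {
                    list[k++] = value;
                }
                bucket.clear();                      // empty the bucket for the next pass
            }
            shift *= RADIX;
        }
    }

    public static void main(String[] args) {
        int[] data = {310, 213, 23, 130, 13, 301, 222, 32, 201, 111, 323, 2, 330, 102, 231, 120};
        radixSort(data, 3);
        System.out.println(Arrays.toString(data));
    }
}
```

Appending to each bucket in arrival order keeps every pass stable, which is what allows the earlier, less significant passes to survive the later ones.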


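Since radix sort works in linear time, why do we even bother with
other algorithms?

One reason is space. In the worst case every element lands in the
same bucket, so each bucket must be able to hold all n elements.
Thus, we need Rn additional storage, where R is the radix (what
could we do to reduce memory usage?). Also, each element will be
moved 2k times.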


Next time

More basic data structures: stacks, queues and priority queues.


Data structures assignment/deadlock.zip


deadlock/Command.java

public class Command {
    private String keyword = null;
    private int parameter = 0;

    public Command() {
        super();
    }

    public Command(String newKeyword, int newParameter) {
        super();
        setKeyword(newKeyword);
        setParameter(newParameter);
    }

    public String getKeyword() {
        return keyword;
    }

    public int getParameter() {
        return parameter;
    }

    public void setKeyword(String newKeyword) {
        keyword = newKeyword;
    }

    public void setParameter(int newParameter) {
        parameter = newParameter;
    }

    public String toString() {
        return ("Command[keyword=" + keyword + ",parameter=" + parameter + "]");
    }
}


deadlock/CommandParser.java

import java.io.*;

public class CommandParser {
    private StreamTokenizer in;
    private InputStream inputStream;

    public CommandParser(InputStream newInputStream) {
        super();
        inputStream = newInputStream;
        in = new StreamTokenizer(inputStream);
        in.eolIsSignificant(true);
        in.ordinaryChar('/');
        in.slashSlashComments(true);
        in.slashStarComments(true);
    }

    public void close() throws IOException {
        inputStream.close();
    }

    // Reads one command line; returns null at end of file.
    public Command getCommand() {
        String commandLetter = null;
        int commandNumber = 0;
        try {
            int t;
            int state = 0;
            boolean looping = true;
            while (looping) {
                t = in.nextToken();
                if (t == StreamTokenizer.TT_EOF) {
                    if (state != 0)
                        throw new Exception("unexpected text on line " + in.lineno());
                    else
                        return null;
                }
                switch (state) {
                    case 0: // expect letter
                        if (t == StreamTokenizer.TT_WORD) {
                            if (in.sval.equals("C") || in.sval.equals("R") || in.sval.equals("F")) {
                                commandLetter = in.sval;
                                state = 1;
                            } else if (in.sval.equals("H")) {
                                commandLetter = in.sval;
                                state = 2;
                            } else
                                throw new Exception("C,R,F, or H expected at start of line " + in.lineno());
                        } else if (t == StreamTokenizer.TT_EOL)
                            ; // do nothing
                        else
                            throw new Exception("C,R,F, or H expected at start of line " + in.lineno());
                        break;
                    case 1: // expect parameter
                        if (t == StreamTokenizer.TT_NUMBER) {
                            commandNumber = (int) in.nval;
                            state = 2;
                        } else
                            throw new Exception("missing numeric value after command on line " + in.lineno());
                        break;
                    case 2: // expect EOL
                        if (t == StreamTokenizer.TT_EOL) {
                            state = 0;
                            looping = false;
                        } else
                            throw new Exception("unexpected text after command on line " + in.lineno());
                        break;
                }
                // System.out.println( "t " + t + " state " + state + " sval " + in.sval + " nval " + in.nval ) ;
            }
        } catch (IOException e) {
            System.out.println("IOException " + e);
        } catch (Exception e) {
            System.out.println(e.toString());
        }
        return new Command(commandLetter, commandNumber);
    }

    public static void main(String args[]) {
        String f = args[0];
        System.out.println("filename " + f);
        try {
            CommandParser cp = new CommandParser(
                new BufferedInputStream(new FileInputStream(f)));
            while (true) {
                Command command = cp.getCommand();
                if (command == null)
                    break;
            }
            cp.close();
        } catch (IOException e) {
            System.out.println("IOException " + e);
        }
    }
}


deadlock/ControlPanel.java

import java.applet.*;
import java.awt.*;

public class ControlPanel extends Frame {
    boolean running = false;
    boolean hasBeenReset = false;
    Kernel kernel;
    Panel buttonPanel;
    Button runButton;
    Button stopButton;
    Button stepButton;
    Button resetButton;
    Button optionsButton;
    Button processesButton;
    Button resourcesButton;
    Button exitButton;
    Panel timePanel;
    Label timeLabel;
    Label timeValueLabel;
    TextField timeTextField;
    Panel topPanel;
    ProcessesPanel processesPanel;
    ResourcesPanel resourcesPanel;
    OptionsDialog optionsDialog;
    ProcessesDialog processesDialog;
    ResourcesDialog resourcesDialog;

    public ControlPanel() {
        super();
    }

    public ControlPanel(String title) {
        super(title);
    }

    public void init(Kernel useKernel) {
        kernel = useKernel;
        kernel.setControlPanel(this);

        runButton = new Button("run");
        stopButton = new Button("stop");
        stepButton = new Button("step");
        resetButton = new Button("reset");
        optionsButton = new Button("options");
        processesButton = new Button("processes");
        resourcesButton = new Button("resources");
        exitButton = new Button("exit");

        buttonPanel = new Panel();
        buttonPanel.add(runButton);
        buttonPanel.add(stopButton);
        buttonPanel.add(stepButton);
        buttonPanel.add(resetButton);
        buttonPanel.add(optionsButton);
        buttonPanel.add(processesButton);
        buttonPanel.add(resourcesButton);
        buttonPanel.add(exitButton);

        timeLabel = new Label("Time: ", Label.RIGHT);
        timeValueLabel = new Label(Integer.toString(kernel.getTime()), Label.LEFT);
        timeValueLabel.setForeground(Color.white);
        timeValueLabel.setBackground(Color.black);
        // timeTextField = new TextField( Integer.toString( kernel.getTime() ), 6) ;
        // timeTextField.setEditable(false) ;
        // timeTextField.enable(false);
        // timeTextField.setForeground( Color.white ) ;
        // timeTextField.setBackground( Color.black ) ;

        timePanel = new Panel();
        timePanel.add(timeLabel);
        timePanel.add(timeValueLabel);
        // timePanel.add( timeTextField ) ;

        processesPanel = new ProcessesPanel(kernel.getProcessCount());
        processesPanel.setProcesses(kernel.getProcesses());
        processesPanel.init();

        resourcesPanel = new ResourcesPanel(kernel.getResourceCount());
        resourcesPanel.setResources(kernel.getResources());
        resourcesPanel.init();

        optionsDialog = new OptionsDialog(this, "optionsDialog", true);
        processesDialog = new ProcessesDialog(this, "processesDialog", true);
        processesDialog.setProcesses(kernel.getProcesses());
        resourcesDialog = new ResourcesDialog(this, "resourcesDialog", true);
        resourcesDialog.setResources(kernel.getResources());

        stopButton.disable();
        runButton.requestFocus();

        topPanel = new Panel();
        topPanel.setLayout(new BorderLayout());
        topPanel.add("North", buttonPanel);
        topPanel.add("South", timePanel);

        GridBagLayout gbl = new GridBagLayout();
        GridBagConstraints gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 1;
        gbc.gridwidth = 2;
        gbc.gridheight = 1;
        gbl.setConstraints(topPanel, gbc);

        gbc.gridx = 1;
        gbc.gridy = 2;
        gbc.gridwidth = 1;
        gbc.gridheight = 1;
        gbc.anchor = GridBagConstraints.NORTH;
        gbl.setConstraints(processesPanel, gbc);

        gbc.gridx = 2;
        gbc.gridy = 2;
        gbc.gridwidth = 1;
        gbc.gridheight = 1;
        gbc.anchor = GridBagConstraints.NORTH;
        gbl.setConstraints(resourcesPanel, gbc);

        add(topPanel);
        add(processesPanel);
        add(resourcesPanel);
        setLayout(gbl);

        kernel.reset();
        hasBeenReset = true;

        pack();
        Dimension screenSize = getToolkit().getScreenSize();
        Dimension size = getSize();
        setLocation((screenSize.width - size.width + 1) / 2,
                    (screenSize.height - size.height + 1) / 2);
        show();
        requestFocus();
    }

    public boolean action(Event e, Object arg) {
        if (e.target == runButton) {
            if (!hasBeenReset) {
                kernel.reset();
                hasBeenReset = true;
            }
            runButton.disable();
            stopButton.enable();
            stepButton.disable();
            resetButton.disable();
            optionsButton.disable();
            processesButton.disable();
            resourcesButton.disable();
            stopButton.requestFocus();
            kernel.setStepping(false);
            kernel.resume();
            running = true;
            return true;
        } else if (e.target == stopButton) {
            stopAction();
            kernel.suspend();
            return true;
        } else if (e.target == stepButton) {
            if (!hasBeenReset) {
                kernel.reset();
                hasBeenReset = true;
            }
            kernel.setStepping(true);
            kernel.resume();
            return true;
        } else if (e.target == resetButton) {
            kernel.reset();
            hasBeenReset = true;
            return true;
        } else if (e.target == optionsButton) {
            optionsDialog.setProcessCount(kernel.getProcessCount());
            optionsDialog.setResourceCount(kernel.getResourceCount());
            optionsDialog.setSleepTime(kernel.getSleepTime());
            optionsDialog.show();
            kernel.setProcessCount(optionsDialog.getProcessCount());
            kernel.setResourceCount(optionsDialog.getResourceCount());
            kernel.setSleepTime(optionsDialog.getSleepTime());
            processesPanel.setProcessCount(optionsDialog.getProcessCount());
            resourcesPanel.setResourceCount(optionsDialog.getResourceCount());
            hasBeenReset = false;
            return true;
        } else if (e.target == processesButton) {
            processesDialog.setProcessCount(kernel.getProcessCount());
            processesDialog.show();
            processesPanel.show();
            hasBeenReset = false;
            return true;
        } else if (e.target == resourcesButton) {
            resourcesDialog.setResourceCount(kernel.getResourceCount());
            resourcesDialog.show();
            resourcesPanel.show();
            hasBeenReset = false;
            return true;
        } else if (e.target == exitButton) {
            kernel.stop();
            System.exit(0);
            return true;
        } else {
            System.out.println(e.toString());
            return false;
        }
    }

    public void stopAction() {
        runButton.enable();
        stopButton.disable();
        stepButton.enable();
        resetButton.enable();
        optionsButton.enable();
        processesButton.enable();
        resourcesButton.enable();
        runButton.requestFocus();
        running = false;
    }

    public void setTime(int newTime) {
        Dimension oldSize = timeValueLabel.getSize();
        timeValueLabel.setText(Integer.toString(newTime));
        Dimension newSize = timeValueLabel.getMinimumSize();
        if (newSize.width > oldSize.width)
            timeValueLabel.invalidate();
    }

    public void setProcessId(int i, int newId) {
        processesPanel.setProcessId(i, newId);
    }

    public void setProcessState(int i, String newState) {
        processesPanel.setProcessState(i, newState);
    }

    public void setProcessResource(int i, String newResource) {
        processesPanel.setProcessResource(i, newResource);
    }

    public void setResourceId(int i, int newId) {
        resourcesPanel.setResourceId(i, newId);
    }

    public void setResourceAvailable(int i, int newAvailable) {
        resourcesPanel.setResourceAvailable(i, newAvailable);
    }
}


deadlock/DatFilenameFilter.java

import java.io.File;
import java.io.FilenameFilter;

public class DatFilenameFilter implements FilenameFilter {
    public DatFilenameFilter() {
        super();
    }

    // Accept only files whose names end in ".dat".
    public boolean accept(File dir, String name) {
        return name.endsWith(".dat");
    }
}

deadlock/deadlock.java

public class deadlock {
    /**
     * This is the main method that runs the application. Any number of arguments may be
     * specified on the command line, but none are required. The first
     * argument is the number of processes to create, while the subsequent
     * arguments are the number of each resource that is initially available.
     */
    public static void main(String args[]) {
        ControlPanel controlPanel;
        Kernel kernel;

        kernel = new Kernel();

        if (args.length == 0) {
            System.out.println("Usage: java deadlock … ");
            System.exit(0);
        }
        if (args.length > 0) {
            kernel.setProcessFilenamePrefix(args[0]);
        }
        if (args.length > 1)
            try {
                kernel.setProcessCount(Integer.valueOf(args[1]).intValue());
            } catch (NumberFormatException e) {
                System.err.println("Invalid number \"" + args[1] + "\" specified as process count");
                System.exit(0);
            }
        if (args.length > 2) {
            kernel.setResourceCount(args.length - 2);
        }

    public static void allocate(int id, Resource resource) {
        resource.setCurrentAvailable(resource.getCurrentAvailable() - 1);
        // we also need to note that the process has the resource allocated to it
        Process p = (Process) processes.elementAt(id);
        p.addAllocatedResource(resource);
    }

    public static void deallocate(int id, Resource resource) {
        resource.setCurrentAvailable(resource.getCurrentAvailable() + 1);
        // we also need to note that this process no longer has the resource allocated
        Process p = (Process) processes.elementAt(id);
        p.removeAllocatedResource(resource);
    }

    /**
     * All processes are blocked. One of them should be killed
     * and its resources deallocated so that the others can continue.
     */
    public static void deadlocked() {
    }


deadlock/Kernel.java

import java.util.Vector;
import java.lang.Thread;
import java.io.IOException;

public class Kernel extends Thread {
    private int time = 0;
    private int sleepTime = 1000;
    private String processFilenamePrefix = "process";
    private String processFilenameSuffix = ".dat";
    private int processCount;
    private int resourceCount;
    private Vector processes = new Vector();
    private Vector resources = new Vector();
    private ControlPanel controlPanel;
    private boolean stepping = false;
    private int haltedCount = 0;
    private int blockedCount = 0;

    public void processHalted() {
        haltedCount++;
    }

    public void processBlocked() {
        blockedCount++;
    }

    public void processUnblocked() {
        blockedCount--;
    }

    public void setProcessFilenamePrefix(String newProcessFilenamePrefix) {
        processFilenamePrefix = newProcessFilenamePrefix;
    }

    public String getProcessFilenamePrefix() {
        return processFilenamePrefix;
    }

    public void setProcessFilenameSuffix(String newProcessFilenameSuffix) {
        processFilenameSuffix = newProcessFilenameSuffix;
    }

    public String getProcessFilenameSuffix() {
        return processFilenameSuffix;
    }

    public boolean getStepping() {
        return stepping;
    }

    public void setStepping(boolean newStepping) {
        stepping = newStepping;
    }

    public int getTime() {
        return time;
    }

    public void setTime(int newTime) {
        time = newTime;
    }

    public int getSleepTime() {
        return sleepTime;
    }

    public void setSleepTime(int newSleepTime) {
        sleepTime = newSleepTime;
    }

    public void setControlPanel(ControlPanel newControlPanel) {
        controlPanel = newControlPanel;
    }

    public void setProcessCount(int newProcessCount) {
        if (newProcessCount > processCount) {
                    process.timeToCompute--;
                    running = false;
                }
                else
                    process.state = Process.STATE_UNKNOWN;
                break;
            case Process.STATE_RESOURCE_WAIT:
                if (DeadlockManager.available(i, process.resourceAwaiting))
                    if (DeadlockManager.grantable(i, process.resourceAwaiting)) {
                        DeadlockManager.allocate(i, process.resourceAwaiting);
                        process.state = Process.STATE_UNKNOWN;
                        process.resourceAwaiting = null;
                        processUnblocked();
                    }
                    else
                        running = false; // continue to be blocked on this resource
                else
                    running = false; // continue to be blocked on this resource
                break;
            case Process.STATE_HALT:
                // we're already stopped, no need to do anything
                running = false;
                break;
        }
        printStatus();
        time++;
        updateControlPanel();
    }

    private void updateControlPanel() {
        controlPanel.setTime(time);

        {
            GridBagConstraints gbc;
            GridBagLayout gbl = (GridBagLayout) this.getLayout();
            Label idLabel;
            Label availableLabel;
            // add the objects to the vector
            // add the objects to the panel
            {
                // remove the objects from the panel
                this.remove((Label) processIdLabelVector.elementAt(i));
                this.remove((Label) processStateLabelVector.elementAt(i));
                this.remove((Label) processResourceLabelVector.elementAt(i));
                // remove the objects from the vector
                processIdLabelVector.removeElementAt(i);
                processStateLabelVector.removeElementAt(i);
                processResourceLabelVector.removeElementAt(i);
            }
            // redo the layout of the panel
            this.layout();
        }
        processCount = newProcessCount;
    }

    public void setProcesses(Vector newProcesses) {
        processes = newProcesses;
    }

    public void setProcessId(int i, int newId) {
        Label label = (Label) processIdLabelVector.elementAt(i);
        Dimension oldSize = label.getSize();
        label.setText(Integer.toString(newId));
        Dimension newSize = label.getMinimumSize();
        if (newSize.width > oldSize.width)
            label.invalidate();
    }

    public void setProcessState(int i, String newState) {
        Label label = (Label) processStateLabelVector.elementAt(i);
        Dimension oldSize = label.getSize();
        label.setText(newState);
        Dimension newSize = label.getMinimumSize();
        if (newSize.width > oldSize.width)
            label.invalidate();
    }

    public void setProcessResource(int i, String newResourceName) {
        Label label = (Label) processResourceLabelVector.elementAt(i);
        Dimension oldSize = label.getSize();
        label.setText(newResourceName);
        Dimension newSize = label.getMinimumSize();
        if (newSize.width > oldSize.width)
            label.invalidate();
    }

    public void init() {
        GridBagConstraints gbc;
        GridBagLayout gbl = new GridBagLayout();

        processesLabel = new Label("Processes");
        gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 1;
        gbc.gridwidth = GridBagConstraints.REMAINDER;
        gbl.setConstraints(processesLabel, gbc);

        idLabel = new Label("Id", Label.RIGHT);
        gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 2;
        gbl.setConstraints(idLabel, gbc);

        stateLabel = new Label("State", Label.RIGHT);
        gbc = new GridBagConstraints();
        gbc.gridx = 2;
        gbc.gridy = 2;
        gbl.setConstraints(stateLabel, gbc);

        resourceLabel = new Label("Resource", Label.RIGHT);
        gbc = new GridBagConstraints();
        gbc.gridx = 3;
        gbc.gridy = 2;
        gbl.setConstraints(resourceLabel, gbc);

        {
            GridBagConstraints gbc;
            GridBagLayout gbl = (GridBagLayout) this.getLayout();
            Label idLabel;
            Label availableLabel;
            // add the objects to the vector
            // add the objects to the panel
            {
                // remove the objects from the panel
                this.remove((Label) resourceIdLabelVector.elementAt(i));
                this.remove((Label) resourceAvailableLabelVector.elementAt(i));
                // remove the objects from the vector
                resourceIdLabelVector.removeElementAt(i);
                resourceAvailableLabelVector.removeElementAt(i);
            }
            // redo the layout of the panel
            this.layout();
        }
        resourceCount = newResourceCount;
    }

    public void setResources(Vector newResources) {
        resources = newResources;
    }

    public void setResourceId(int i, int newId) {
        Label label = (Label) resourceIdLabelVector.elementAt(i);
        Dimension oldSize = label.getSize();
        label.setText(Integer.toString(newId));
        Dimension newSize = label.getMinimumSize();
        if (newSize.width > oldSize.width)
            label.invalidate();
    }

    public void setResourceAvailable(int i, int newAvailable) {
        Label label = (Label) resourceAvailableLabelVector.elementAt(i);
        Dimension oldSize = label.getSize();
        label.setText(Integer.toString(newAvailable));
        Dimension newSize = label.getMinimumSize();
        if (newSize.width > oldSize.width)
            label.invalidate();
    }

    public void init() {
        GridBagConstraints gbc;
        GridBagLayout gbl = new GridBagLayout();

        resourcesLabel = new Label("Resources");
        gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 1;
        gbc.gridwidth = GridBagConstraints.REMAINDER;
        gbl.setConstraints(resourcesLabel, gbc);

        idLabel = new Label("Id", Label.RIGHT);
        gbc = new GridBagConstraints();
        gbc.gridx = 1;
        gbc.gridy = 2;
        gbl.setConstraints(idLabel, gbc);

        availableLabel = new Label("Available", Label.RIGHT);
        gbc = new GridBagConstraints();
        gbc.gridx = 2;
        gbc.gridy = 2;
        gbl.setConstraints(availableLabel, gbc);

In the docs for the ArrayList class, you will see it described as
ArrayList<T>. That means ArrayList is a container for objects
of any type, which we call T.

When we create an ArrayList we have to say what type of thing
we want to store in it, i.e. what the type T is.


Abstract data types

Using generics we can create a Stack that works for any type:

public class Stack<T> {
    //…
    public void push(T e) { … }
    public T pop() { … }
    public T peek() { … }
}

//

Stack<Integer> myIntStack = new Stack<Integer>();
myIntStack.push(42);    // OK
myIntStack.push("Hi!"); // compile-time error
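To make the outline concrete, here is one possible way to fill in the method bodies. It is a sketch only: it is backed by an ArrayList, the class is named ArrayStack (a hypothetical name, chosen to avoid clashing with java.util.Stack), and the isEmpty method is an addition that the outline above does not mention.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

// Hypothetical generic stack backed by an ArrayList; the top is the last element.
class ArrayStack<T> {
    private final ArrayList<T> items = new ArrayList<>();

    void push(T e) {
        items.add(e);
    }

    // Removes and returns the top element.
    T pop() {
        if (items.isEmpty()) {
            throw new NoSuchElementException("stack is empty");
        }
        return items.remove(items.size() - 1);
    }

    // Returns the top element without removing it.
    T peek() {
        if (items.isEmpty()) {
            throw new NoSuchElementException("stack is empty");
        }
        return items.get(items.size() - 1);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }

    public static void main(String[] args) {
        ArrayStack<Integer> myIntStack = new ArrayStack<>();
        myIntStack.push(42);
        myIntStack.push(7);
        System.out.println(myIntStack.pop());   // 7
        System.out.println(myIntStack.peek());  // 42
    }
}
```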


Balancing parens

We can model this problem with a stack. Every time we encounter
an opening paren character, push it onto a stack.

Every time we encounter a closing paren, pop the stack and check
that the types match.
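The approach just described can be sketched in Java as below. The sketch uses java.util.ArrayDeque as the stack; the class and method names are hypothetical, and characters that are not brackets are simply ignored, which is an assumption rather than something the slides specify.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical bracket checker using a stack of opening characters.
class ParenChecker {
    static boolean isBalanced(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '(':
                case '[':
                case '{':
                    stack.push(c);                       // remember the opener
                    break;
                case ')':
                case ']':
                case '}':
                    if (stack.isEmpty()) {
                        return false;                    // closer with nothing open
                    }
                    if (!matches(stack.pop(), c)) {
                        return false;                    // types do not match
                    }
                    break;
                default:
                    break;                               // ignore other characters
            }
        }
        return stack.isEmpty();                          // leftovers mean unclosed openers
    }

    private static boolean matches(char open, char close) {
        return (open == '(' && close == ')')
            || (open == '[' && close == ']')
            || (open == '{' && close == '}');
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("{([])}"));  // true
        System.out.println(isBalanced("([(]))"));  // false
    }
}
```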

[Figure: the stack after each step on the balanced input {([])}: push('{'), push('('), push('['), then pop()=='[', pop()=='(' and pop()=='{', leaving the stack empty.]

An unbalanced example: ([(])). After push('('), push('[') and push('('), the first closing character is ']', but pop() returns '(' rather than '[', so the check fails (pop()!='[').


Data structures assignment/week3c

CI583: Data Structures and Operating Systems

Balanced Trees


Outline

1 Unbalanced trees

2 Red-black trees

3 Rotations

4 Inserting to a red-black tree

5 Deleting from a red-black tree


Unbalanced trees

Last time we saw how powerful and flexible a data structure the
tree is. We saw that binary search trees can provide O(log n)
retrieval, insertion and removal.

However, this is only true so long as the tree remains fairly well
balanced. If we insert sequential data into a tree then the nodes
arrange themselves just like a linked list. Performance degrades
to linear time.

Say we have a tree made up of 10,000 nodes. If the tree is
maximally unbalanced, then the worst-case scenario of searching for
an item is that it takes 10,000 steps. If the tree is completely
balanced (or complete, or full), the worst-case scenario is 14 steps.
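The two figures can be checked with a small experiment. The sketch below uses a plain, non-balancing BST with hypothetical names, and tracks the height as nodes are inserted. Inserting the keys 1 to 10,000 in order gives a height of 10,000, while inserting medians first (one way, assumed here, of producing a balanced tree) gives a height of 14.

```java
// Hypothetical experiment: height of a naive BST under two insertion orders.
class BstHeightSketch {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;
    private int height;   // nodes on the longest root-to-leaf path so far

    // Ordinary BST insert, written iteratively so deep trees cannot overflow the call stack.
    void insert(int key) {
        if (root == null) {
            root = new Node(key);
            height = 1;
            return;
        }
        Node cur = root;
        int depth = 1;                       // depth of cur, counting the root as 1
        while (true) {
            depth++;                         // depth of the child slot we inspect next
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); break; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); break; }
                cur = cur.right;
            }
        }
        height = Math.max(height, depth);
    }

    int height() { return height; }

    // Keys 1..n in ascending order: every node becomes a right child.
    static BstHeightSketch sequential(int n) {
        BstHeightSketch t = new BstHeightSketch();
        for (int k = 1; k <= n; k++) t.insert(k);
        return t;
    }

    // Keys 1..n with the median of each range inserted first.
    static BstHeightSketch balanced(int n) {
        BstHeightSketch t = new BstHeightSketch();
        insertMedians(t, 1, n);
        return t;
    }

    private static void insertMedians(BstHeightSketch t, int lo, int hi) {
        if (lo > hi) return;
        int mid = lo + (hi - lo) / 2;
        t.insert(mid);
        insertMedians(t, lo, mid - 1);
        insertMedians(t, mid + 1, hi);
    }

    public static void main(String[] args) {
        System.out.println(sequential(10_000).height());  // 10000
        System.out.println(balanced(10_000).height());    // 14
    }
}
```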

Most of the time a tree will not be maximally unbalanced, but
inputting a run of sequential data may leave it partially
unbalanced, or it may have begun with a very small or very large root.
In a tree of natural numbers, for instance, if the root is labelled 3
there can be at most two nodes in the left hand sub-tree.

Operations on a tree like this will be somewhere between O(n) and
O(log n).


Self-balancing trees

Self-balancing trees are the solution to this problem. The first of
these were AVL Trees, invented by Adelson-Velskii and Landis in
1962.

The idea is that self-balancing trees maintain the invariant that no
path from root to leaf is more than twice as long as any other. To
achieve this, the tree must re-balance itself after insertion and
deletion.

Variations on this idea are used in file system design, relational
databases, and whenever fast access to a large amount of data is
required.

For instance, relational databases store indices in memory in a
self-balancing tree structure called a BTree or B+ Tree, providing
logarithmic access time with little or no IO.

Linux file systems such as ext3 store directory listings in an HTree,
which uses a hash function to create a two-level balanced tree of
files. HTree indexing improved the scalability from a practical limit
of a few thousand files, into the range of tens of millions of files per
directory.


Red-black trees

The type of self-balancing tree we will consider in detail is a BST
called the red-black tree. Like the heap we saw in the last lecture,
we define a series of invariants for RB-trees and make sure that
they will all still hold after each operation.

Red-black trees were invented in 1972 by Bayer.

The invariants on RB-trees:

1 Each node is either red or black (think of this "colour" as an
extra bit; we could use 1 or 0 or any other choice).

2 The root and leaves are black.

3 If a node is red, its children must be black.

4 For each node, x, every path from x to a leaf contains the
same number of black nodes.

The motivation for these conditions is probably mysterious to you,
but we will see that maintaining them results in a balanced tree and
gives us the logarithmic performance we want.
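One way to get a feel for the invariants is to write a checker for them. The sketch below is illustrative only: the Node class, its boolean red field and the method names are assumptions, null references stand for the black leaves, and the BST ordering of the keys is not checked.

```java
// Hypothetical checker for the four red-black invariants listed above.
class RedBlackCheck {
    static final class Node {
        final int key;
        final boolean red;          // property 1: one extra bit per node
        final Node left, right;     // null stands for a black leaf
        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;  // null leaves count as black (property 2)
    }

    static boolean isValid(Node root) {
        if (isRed(root)) {
            return false;           // property 2: the root is black
        }
        return blackCount(root) >= 0;
    }

    // Returns the number of black nodes on every path from n down to a leaf,
    // not counting n itself, or -1 if property 3 or 4 fails below n.
    static int blackCount(Node n) {
        if (n == null) {
            return 0;
        }
        if (n.red && (isRed(n.left) || isRed(n.right))) {
            return -1;              // property 3: a red node has black children
        }
        int l = blackCount(n.left);
        int r = blackCount(n.right);
        if (l < 0 || r < 0) {
            return -1;
        }
        int viaLeft = l + (isRed(n.left) ? 0 : 1);
        int viaRight = r + (isRed(n.right) ? 0 : 1);
        return viaLeft == viaRight ? viaLeft : -1;   // property 4
    }

    public static void main(String[] args) {
        Node ok = new Node(2, false,
                new Node(1, true, null, null),
                new Node(3, true, null, null));
        Node lopsided = new Node(2, false,
                new Node(1, false, null, null),
                null);
        System.out.println(isValid(ok));        // true
        System.out.println(isValid(lopsided));  // false
    }
}
```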

An example. The small filled black circles represent null pointers in
the leaves (not normally depicted). These are always black. We
won’t normally show them but this is what we mean when we say
the "leaves" are black.

[Figure: an example red-black tree with its null leaves drawn as small filled black circles.]


The black-height of a node x is the number of black nodes on a
path from x to a leaf, not including x. So we can state property 4
in terms of black-height.
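
The properties and the definition of black-height can be checked
mechanically. The following is a minimal sketch, not code from the
module: RbNode and RbCheck are hypothetical names, a null child stands
for a black leaf, and property 1 holds by construction because the
colour is a boolean.

```java
// Hypothetical node type for illustration; a null child stands for a black leaf.
final class RbNode {
    final int key;
    final boolean red;
    final RbNode left, right;

    RbNode(int key, boolean red, RbNode left, RbNode right) {
        this.key = key;
        this.red = red;
        this.left = left;
        this.right = right;
    }
}

final class RbCheck {
    static boolean isRed(RbNode n) {
        return n != null && n.red;
    }

    /**
     * Number of black nodes on every path from n to a leaf, counting n itself
     * and the null leaf. Returns -1 if property 3 or 4 fails at or below n.
     */
    static int blackCount(RbNode n) {
        if (n == null) return 1;                      // null leaves are black
        int l = blackCount(n.left);
        int r = blackCount(n.right);
        if (l == -1 || r == -1 || l != r) return -1;  // property 4 broken
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1; // property 3 broken
        return l + (n.red ? 0 : 1);
    }

    /** Black-height of x: black nodes on any path from x to a leaf, excluding x. */
    static int blackHeight(RbNode x) {
        int count = blackCount(x);
        if (count == -1) return -1;
        boolean xIsBlack = !isRed(x);
        return xIsBlack ? count - 1 : count;
    }

    /** Checks properties 2, 3 and 4 (property 1 holds by construction). */
    static boolean isValid(RbNode root) {
        return !isRed(root) && blackCount(root) != -1;
    }
}
```

The check visits each node once, so it runs in O(n) time; it does not
verify the BST ordering of the keys, which would need a separate pass.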


Because we include the null pointers, a red-black tree is a branching
BST: every node has 2 or 0 children. The properties force a
red-black tree with n nodes to have O(log n) height. Actually, the
height will be at most 2 log(n + 1); see (Cormen 2009, p309) for a
proof.

Queries (e.g. search, find the minimum or maximum element etc.)
will require a visit to every level at worst, giving us O(h) or
O(log n) time. Updates (insertion and deletion) are more tricky.


Rotations

Before we can describe how to update a red-black tree, we need to
understand rotations. A rotation is a local change to the structure
of the tree that preserves the BST property.

[Figure: left-rotate(x) turns a subtree rooted at x into one rooted
at y; right-rotate(y) reverses it.]


balanced RB trees.


We can easily see that rotations preserve the BST property: The

keys in α are less than the key of x, which is less than the key of y,
and so on.
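
A rotation can be sketched in a few lines. This is an illustrative
simplification rather than the module's code: RotNode and Rotations are
hypothetical names, nodes carry no parent pointers or colours, and each
method returns the new root of the rotated subtree so that the caller
can re-attach it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type for illustration (no parent pointer, no colour).
final class RotNode {
    final int key;
    RotNode left, right;

    RotNode(int key, RotNode left, RotNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

final class Rotations {
    /** Left-rotate around x; x.right must not be null. Returns the new subtree root y. */
    static RotNode leftRotate(RotNode x) {
        RotNode y = x.right;
        x.right = y.left;   // the middle subtree moves from y to x
        y.left = x;
        return y;
    }

    /** Right-rotate around y; y.left must not be null. Returns the new subtree root x. */
    static RotNode rightRotate(RotNode y) {
        RotNode x = y.left;
        y.left = x.right;   // the middle subtree moves from x to y
        x.right = y;
        return x;
    }

    /** In-order key sequence, used to show that a rotation keeps the keys sorted. */
    static List<Integer> inOrder(RotNode n) {
        List<Integer> out = new ArrayList<>();
        collect(n, out);
        return out;
    }

    private static void collect(RotNode n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.key);
        collect(n.right, out);
    }
}
```

Only a constant number of references change, and the in-order sequence
of keys is the same before and after, which is the BST property.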


An example within a BST.

[Figure: left-rotate(T, x) applied to a node x within a BST, before
and after.]

Rotations take constant time since they only involve switching

some pointers around. Recolouring is, of course, also done in

constant time.

We will see that these two techniques are all we need to maintain

the properties in a red-black tree.


Inserting to a red-black tree

We insert an element, x, to a red-black tree, T, as follows:

1 Insert x as if T were an ordinary BST (see last lecture). This
step may break the RB properties.

2 Colour x red.

3 Restore the RB properties by recolouring and rotating.

After restoring the RB properties we know that the new tree, T′, is
balanced (by the proof in Cormen mentioned before).


Case 1: Recolouring

We can recolour a node whenever doing so does not change the

black-heights of the tree. This occurs when the parent and uncle

(other child of the grandparent) of the node are both red.

[Figure: recolouring when the parent and uncle of the violating node
are both red.]

Recolouring moves the problem up the tree. A is shown with only
one child because it doesn’t matter if B is the right or left child.


Case 2: left rotation

If we can’t recolour any more, we use rotations. The first case is
where z, the violating node, is the right child of its parent. We use
a left rotation to achieve the situation where z is the left child.

[Figure: a left rotation making z a left child.]

Case 3: right rotation and recolouring

Case 2 is followed immediately by case 3, in which we use a right
rotation and recolouring.

[Figure: case 3, right rotation and recolouring.]

Note that case 2 falls through into case 3, but case 3 is a case of
its own, i.e. if case 2 is not applicable we may still be able to
apply case 3.


nodes in a row, the algorithm terminates.


Red-black insertion
A complete example

Preparing to insert the value 15 to a red-black tree.

[Figure: the tree before insertion.]

After inserting the new node and colouring it red, we have broken

property 3. Looking at the grandparent of the new node, we have a

candidate for recolouring.


Now the violation has moved further up the tree and we can’t do

any more recolouring. The violating node is the left child of its

parent, so use right rotation.


We have straightened out the dog-leg. Now the violating node is
the right child of its parent. Rotate to the left.

Recolour the root and we are done.


The pseudocode for insertion to a red-black tree is quite easy to

follow but a bit too long to go on a slide, simply because there are

a lot of cases to consider. Again, see Cormen for an example.
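
As a rough guide to what that pseudocode looks like in Java, here is a
sketch of insertion with the three cases described above. It follows
the general shape of the textbook algorithm but is an illustration
only: the class RedBlackTree, its fields and the helper replaceChild
are hypothetical names, null references stand in for the black leaves,
and duplicate keys are simply ignored.

```java
final class RedBlackTree {
    private static final boolean RED = true, BLACK = false;

    static final class Node {
        final int key;
        boolean colour = RED;            // step 2: a new node is coloured red
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    Node root;

    private static boolean isRed(Node n) { return n != null && n.colour == RED; }

    /** Number of nodes on the longest root-to-leaf path (0 for an empty tree). */
    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    /** Puts 'replacement' in the position currently occupied by 'old'. */
    private void replaceChild(Node old, Node replacement) {
        replacement.parent = old.parent;
        if (old.parent == null) root = replacement;
        else if (old == old.parent.left) old.parent.left = replacement;
        else old.parent.right = replacement;
    }

    private void leftRotate(Node x) {
        Node y = x.right;
        x.right = y.left;
        if (y.left != null) y.left.parent = x;
        replaceChild(x, y);
        y.left = x;
        x.parent = y;
    }

    private void rightRotate(Node y) {
        Node x = y.left;
        y.left = x.right;
        if (x.right != null) x.right.parent = y;
        replaceChild(y, x);
        x.right = y;
        y.parent = x;
    }

    void insert(int key) {
        // Step 1: ordinary BST insertion.
        Node parent = null, cur = root;
        while (cur != null) {
            if (key == cur.key) return;  // keys are unique
            parent = cur;
            cur = key < cur.key ? cur.left : cur.right;
        }
        Node z = new Node(key);
        z.parent = parent;
        if (parent == null) root = z;
        else if (key < parent.key) parent.left = z;
        else parent.right = z;
        fixup(z);                        // step 3
    }

    private void fixup(Node z) {
        while (isRed(z.parent)) {        // property 3 is violated at z
            Node p = z.parent, g = p.parent;  // g exists: a red node is never the root
            boolean parentIsLeft = (p == g.left);
            Node uncle = parentIsLeft ? g.right : g.left;
            if (isRed(uncle)) {          // case 1: recolour, move the problem up
                p.colour = BLACK;
                uncle.colour = BLACK;
                g.colour = RED;
                z = g;
            } else if (parentIsLeft) {
                if (z == p.right) { z = p; leftRotate(z); }   // case 2
                z.parent.colour = BLACK;                      // case 3
                g.colour = RED;
                rightRotate(g);
            } else {                     // mirror image of cases 2 and 3
                if (z == p.left) { z = p; rightRotate(z); }
                z.parent.colour = BLACK;
                g.colour = RED;
                leftRotate(g);
            }
        }
        root.colour = BLACK;             // property 2
    }
}
```

Inserting keys in sorted order, which would degrade a plain BST into a
list, is a convenient way to see the re-balancing at work: the height
stays within the 2 log(n + 1) bound.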


Red-black deletion

Similarly to insertion, we delete from a red-black tree just as we
would from a BST, then call a “fixup” routine to repair the RB
properties that might have been broken in the previous step. Again,
properties are fixed by recolouring and rotation.

Deleting a red node cannot violate the RB properties so we only
call the “fixup” routine when the node we removed was black.


Let y be a black node removed from a red-black tree and x be the
node that takes its place. What might have been broken by the

removal of y?

1 If y was the root of the tree and x is red, we have violated
property 2.

2 If x and x.p are both red, we have violated property 3.

3 The removal of y means that there is one fewer black node in
any path through x, so we have definitely violated property 4.

To get around the last problem we start by saying that x is either
doubly black (if it was black to start with) or red-black. In this way,

x contributes 2 or 1 to the black-height of any path passing
through it, rather than 1 or 0.


We move the extra blackness up the tree until:

1 x points to a red-black node, in which case we colour it black.

2 x points to the root, in which case we are done.

3 We apply recolouring and rotations until the properties are
fixed.


Thus, whilst x is a non-root doubly black node, we have changes
that need to be made. The cases are as follows:

1 Case 1: x’s sibling, w, is red. Rotate and recolour.

2 Case 2: w is black and both of w’s children are black.
Recolour and move the problem further up the tree.

3 Case 3: w is black, w.left is red and w.right is black.
Recolour and rotate.

4 Case 4: w is black and w.right is red. Recolour and rotate.

Note that these cases aren’t mutually exclusive.


Next week

An overview of some algorithmic strategies: divide-and-conquer,

greedy algorithms, backtracking, dynamic programming.


Data structures assignment/week3b

CI583: Data Structures and Operating Systems

Binary Trees


Binary search trees

Trees start to get interesting when we place some constraints on

their structure. One constraint is that their labels are ordered. If we

do this with a binary tree we get a binary search tree (BST),

defined as follows:

1 for each non-leaf node, n, the key of the left child is less than
the key of n and the key of the right child is greater than the
key of n,

2 keys are unique, and

3 the left and right children of n are binary search trees.


which either has a key equal to k or k was not found.


Inserting to a BST

Inserting a new key, k, works similarly. We find the right place to
put k by following branches until the sub-tree to follow does not
exist, then we attach a new node with k as the label.
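
The description above translates almost directly into code. A minimal
sketch, assuming integer keys; the class name Bst and its methods are
hypothetical and not part of any assignment skeleton.

```java
final class Bst {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    /** Inserts k; returns false if k is already present (keys are unique). */
    boolean insert(int k) {
        if (root == null) {
            root = new Node(k);
            return true;
        }
        Node cur = root;
        while (true) {
            if (k == cur.key) return false;
            if (k < cur.key) {
                // Follow the left branch, or attach here if it does not exist.
                if (cur.left == null) { cur.left = new Node(k); return true; }
                cur = cur.left;
            } else {
                // Follow the right branch, or attach here if it does not exist.
                if (cur.right == null) { cur.right = new Node(k); return true; }
                cur = cur.right;
            }
        }
    }

    /** Searches for k by following the same branches that insert would. */
    boolean contains(int k) {
        Node cur = root;
        while (cur != null) {
            if (k == cur.key) return true;
            cur = k < cur.key ? cur.left : cur.right;
        }
        return false;
    }
}
```

For example, after inserting 5, 3, 8 and 6, inserting 7 compares it
with 5, then 8, then 6, and attaches it as the right child of 6.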


In this example:

a is the prefix for the names of the files containing the commands, (the actual names of the
files are “a0.dat”, and “a1.dat”),

2 is the number of processes to be created,

1 is the number of instances to create for the first resource, and

a.log is the name of the output file.

Typically, you will use the step button to execute a cycle of the simulation and observe the
effect on the resources and processes. When you’re done, quit the simulation using the exit
button.

The Command Line

The general form of the command line is

$ java deadlock file-name-prefix initial-number-of-processes initial-available-for-resource …

where

Parameter                         Description
file-name-prefix
initial-number-of-processes
initial-available-for-resource…

The Control Panel

The main control panel for the simulator includes a row of command buttons, and an
informational display.

The buttons:

Button     Description
run        runs the simulation to completion. Note that the simulation pauses and
           updates the screen between each step.
stop       stops the simulation if it is running. This button is only active if the
           run button has been pressed.
step       runs a single step of the simulation and updates the display.
reset      initializes the simulator and starts from the initial values for each
           process and resource.
options    allows you to change various options for the simulator, including the
           number of resources and the number of processes.
resources  allows you to change the configuration for each resource, including the
           initial and current number of instances available for each resource.
processes  allows you to change the configuration for each process, including
           current state and the name of the command file for that process.
exit       exits the simulation.

The informational display:

Field                Description
Time:                number of “milliseconds” since the start of the simulation.
Resource Id:         A number which identifies the particular resource. Resources
                     are numbered starting with zero.
Resource Available:  The number of instances available for the particular resource.
                     This is a non-negative number.
Process Id:          A number which identifies the particular process. Processes
                     are numbered starting with zero.
Process State:
Process Resource:    The resource for which this process is waiting, if any. This
                     field only has a value if the process is in W status.

The Options Dialog Box

The Options Dialog Box allows you to set general options for the simulator.

The options:

Field                   Description
Number of Processes:
Number of Resources:
Milliseconds per step:

The Processes Dialog Box

The Processes Dialog Box allows you to enter or modify properties for each process.

The process properties:

Field                 Description
Number of Processes:  The number of processes in the simulation. To change this
                      value, use the Options Dialog (see above).
Process Id            The id number for the process. These numbers are used to
                      identify each process and are assigned by the simulator,
                      starting with zero. These numbers cannot be changed.
Process File Name

The Resources Dialog Box

The Resources Dialog Box allows you to enter and modify properties for each resource.

The resource properties:

Field                 Description
Number of resources:  The number of resources available in the simulation. To change
                      this value, use the Options Dialog (see above).
Resource Initial      The initial number of available instances of the resource. This
                      number is used when the simulator starts or is reset.
Resource Current      The current number of available instances of the resource. This
                      number may be changed during the simulation to see the effect
                      it may have on processes waiting for the resource.

Operation      Description
C msec         Compute for the specified number of milliseconds (cycles).
R resource-id
F resource-id
H              Halt the process. This is usually the last operation in the file. Any
               commands which follow it in the file are ignored. Any file that does
               not end with this operation is implicitly halted.

Sample Process Command Files

The “a0.dat” input file looks like this:

/*
a0.dat

Note that the “a1.dat” file is identical. In other words, both files request the same resources at
approximately the same time.

The Output File
The output file contains a log of the simulation since the simulation started.

The output file contains one line per cycle executed. The format of each line is:

time = t available = r0 r1 … rn blocked = n

where
t is the number of milliseconds since the start of the simulation,
ri is the number of available instances of each resource, and
n is the number of blocked processes.
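
If you want to analyse a log programmatically, a line in this format
can be split into its three parts. The sketch below is an assumption
about the exact layout (integers separated by whitespace, as in the
format line above); LogLine is a hypothetical helper, not part of the
simulator, and the sample line in the usage note is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLine {
    // Assumed layout: "time = t available = r0 r1 ... rn blocked = n".
    private static final Pattern FORMAT = Pattern.compile(
            "time\\s*=\\s*(\\d+)\\s+available\\s*=\\s*((?:\\d+\\s+)*)blocked\\s*=\\s*(\\d+)");

    final int time;          // t
    final int[] available;   // r0 .. rn
    final int blocked;       // n

    private LogLine(int time, int[] available, int blocked) {
        this.time = time;
        this.available = available;
        this.blocked = blocked;
    }

    static LogLine parse(String line) {
        Matcher m = FORMAT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognised log line: " + line);
        }
        String counts = m.group(2).trim();
        String[] parts = counts.isEmpty() ? new String[0] : counts.split("\\s+");
        int[] available = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            available[i] = Integer.parseInt(parts[i]);
        }
        return new LogLine(Integer.parseInt(m.group(1)), available,
                Integer.parseInt(m.group(3)));
    }
}
```

For instance, LogLine.parse("time = 3 available = 1 0 blocked = 2")
yields time 3, two resource counts (1 and 0) and 2 blocked processes.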

Sample Output

The output file “a.log” looks something like this:

Suggested Exercises
1. Try running the deadlock simulator using the following command:

java deadlock a 2 2

Explain why a deadlock does not occur.

2. There are two additional process command files (“b0.dat” and “b1.dat”) in the
distribution. Run the deadlock simulator with this command:

java deadlock b 2 1 1

What happens?

Now try this.

java deadlock b 2 1 2

Why does the first command result in a deadlock but the second does not? Explain your
answer in terms of what is going on in the process command files, b0.dat and b1.dat.


Data structures assignment/week11d

Distributed systems

CI583: Data Structures and Operating Systems
Distributed systems


Distributed operating systems


This can include clusters and cloud computing, but also more
tightly coupled distributed systems.

Truly distributed OS


The motivation for this architecture includes:

1 Load balancing.

2 Reliability, redundancy and fault tolerance. Several nodes can
work on the same task, and the first one to complete.

3 Availability.

The danger is that nodes might spend most of their time
communicating with other nodes…


Within such a system, each machine runs a microkernel that
manages that node’s hardware and handles communication with
other nodes.

Each node also contains one or more system management
components, that provide that node’s functionality.


A key distinction between such systems is how the distribution is
managed:

1 centralised: all activity is managed by a single master node,

2 decentralised: a tree-like structure where nodes are branches,
and manage the activity of nodes beneath them, or leaves,

3 distributed: each node has one or more links to other nodes
and there is no master. There may be no way for any node to
“see” the entire system.

Distributed architecture: centralised

[Figure: centralised architecture, with one master and four nodes.]
Distributed architecture: decentralised

[Figure: decentralised architecture, with a master and four nodes.]
Distributed architecture: clustered

[Figure: clustered architecture.]

These tightly-coupled systems were the focus of research for
decades from the 1970s onward, but no system ever solved the
problem of efficient communication between nodes.

The OS usually communicates with IO devices like network cards
via a high speed bus, for example. Doing the same thing over a
LAN is far slower.

Since the early 1990s, attention has turned to more loosely-coupled
systems such as clusters – collections of independent machines
each with their own OS.


Clusters

Clusters of quite cheap machines have been assembled that rival
the world’s fastest and most expensive individual computers.

Using commodity machines means that when a node fails, it is just
thrown out and replaced (though it might just be left in place for a
while and collected with other dead nodes in one sweep).


A cluster provides large-scale parallelism (small-scale being taking
advantage of multiple cores), so that jobs can be broken up into
tasks that are worked on simultaneously.

Most cluster architectures still rely on a master node whose failure
will wipe out the system.


http://research.google.com/archive/mapreduce.html


The Compute Engine has 96,250 physical machines, with access to
770,000 cores. Each core has access to 3.75GB of RAM.

Figures from http://www.extremetech.com/.


Hadoop is a FOSS system based on a reverse-engineering of
MapReduce. It is used by Facebook, Yahoo!, Amazon and others.

Three problems to solve:

1 Message passing.

2 Task scheduling.

3 Node failure.

Message passing

Especially in large clusters, it is important to minimise the distance
of each node from the master. Hadoop is “rack-aware”, so it
knows the physical location of each node.

Task scheduling

Deciding which tasks should be given to which nodes is an open
problem.

MapReduce has a process on the master node called the
JobTracker, which breaks work down into individual tasks.


MapReduce uses a strategy of high redundancy – several nodes
might be working on the same item of work and so each item is
guaranteed to be done at least once, rather than exactly once.

Node failure

When a node fails or appears to be malfunctioning, that task is
re-executed (unless a high-redundancy strategy is already in place,
meaning that another node is already working on the task).

This is called resource fencing.


Like virtualisation, clustered computing is an idea that has really
taken off.

Rather than running their own server room, a small-to-medium
sized business would now be much better off outsourcing this
problem to a company like Google or Amazon S3.


The company’s computing needs are then served by a cluster
sitting in the cloud.

The nodes in the cluster are probably running VMs.

This takes care of load-balancing, reliability, backup, expansion and
contraction of the network, load on individual nodes and
bandwidth usage.


Data structures assignment/week10b

64-bit systems

CI583: Data Structures and Operating Systems
Memory management part 2


Page tables

The system we have described so far is a complete page table.

In this system, every page in the addressable memory has an entry
in the page table, even though most of these entries will have the
validity bit set to zero.

Rather than simply holding an entire page table that maps to all
addressable memory, different schemes are used to store page
tables more efficiently.


Such schemes are needed because complete page tables take up too
much space. They also have other benefits: for instance, by using an
indirect mapping we can increase the addressable memory.

In various ways, these schemes aim to hold in memory only those
parts of the page table that map to the parts of primary storage
currently being used.


Forward-mapped page tables

Forward-mapped (or multilevel) page tables break a virtual
memory location down into three parts:

the Level 1 page number (L1),

Level 2 page number (L2), and

the offset.


[Figure: a virtual memory location split into L1 page no., L2 page
no. and offset; the L1 page table leads to the L2 page tables, which
lead to a page frame. Image adapted from (Doeppner, 2011).]


As before, entries in the L1 and L2 tables can be valid or invalid.

Looking up a location means finding the entry in L1, which indexes
a particular L2 page table.

Then the entry in the L2 table is looked up – this points to a page
frame (or is invalid).


The benefit of this system is that not all L2 tables need to be in
memory at any one time, greatly reducing the amount of memory
taken up.

However, translating a virtual memory location now takes two or
three lookups.
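
The two-step lookup can be made concrete with a small model. This is a
sketch under stated assumptions, not a description of any real MMU: it
assumes a 32-bit virtual address split into a 10-bit L1 page number, a
10-bit L2 page number and a 12-bit offset, and TwoLevelPageTable is a
hypothetical class. An L2 table is only allocated when a page in its
range is first mapped, which is where the memory saving comes from.

```java
import java.util.Arrays;

final class TwoLevelPageTable {
    // Assumed split of a 32-bit address: 10 + 10 + 12 bits.
    static final int L1_BITS = 10, L2_BITS = 10, OFFSET_BITS = 12;
    static final int INVALID = -1;

    // l1[i] is null (invalid entry) or an L2 table of page frame numbers.
    private final int[][] l1 = new int[1 << L1_BITS][];

    static int l1Index(int va) { return va >>> (OFFSET_BITS + L2_BITS); }

    static int l2Index(int va) { return (va >>> OFFSET_BITS) & ((1 << L2_BITS) - 1); }

    /** Maps the page containing virtualAddress to the given page frame. */
    void map(int virtualAddress, int frame) {
        int i1 = l1Index(virtualAddress);
        if (l1[i1] == null) {                       // allocate the L2 table on demand
            l1[i1] = new int[1 << L2_BITS];
            Arrays.fill(l1[i1], INVALID);
        }
        l1[i1][l2Index(virtualAddress)] = frame;
    }

    /** Returns the physical address, or INVALID if either entry is invalid. */
    long translate(int virtualAddress) {
        int[] l2 = l1[l1Index(virtualAddress)];     // lookup 1: the L1 table
        if (l2 == null) return INVALID;
        int frame = l2[l2Index(virtualAddress)];    // lookup 2: the L2 table
        if (frame == INVALID) return INVALID;
        int offset = virtualAddress & ((1 << OFFSET_BITS) - 1);
        return ((long) frame << OFFSET_BITS) | offset;
    }
}
```

With this split, mapping the page that contains address 0x00403ABC to
frame 7 touches L1 entry 1 and L2 entry 3, and the address translates
to 0x7ABC.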


Linear page tables

Linear page tables break up the entire address space into several
(e.g. 4) spaces, each with its own page table which is itself held in
virtual memory.

This usually requires fewer memory accesses than forward-mapped
tables, because it takes advantage of the fact that the most
common pattern of memory access within processes is contiguous.


Hashed page tables

Hashed page tables take a different approach to the ones we’ve
seen so far.

The page table is loaded with translation information for only those
parts of the address space which are in use (i.e. no invalid entries).

We access the page table using a function which hashes the virtual
memory location.


Each entry in the page table contains, as well as the page frame
number, the original page number and a link to the next entry with
the same hash value.

Thus, looking up a page table entry might mean looking up the
entry based on the hash, then following one or more links in the
table to find the real entry.
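
The entry layout and chained lookup described above can be sketched as
follows. The class HashedPageTable, the modulo hash function and the
choice to prepend new entries to a chain are all assumptions made for
illustration.

```java
final class HashedPageTable {
    /** One entry: the original page number, the page frame, and a link to the next entry. */
    private static final class Entry {
        final long pageNumber;
        final int frame;
        final Entry next;

        Entry(long pageNumber, int frame, Entry next) {
            this.pageNumber = pageNumber;
            this.frame = frame;
            this.next = next;
        }
    }

    private final Entry[] buckets;

    HashedPageTable(int size) {
        buckets = new Entry[size];
    }

    // Assumed hash function: page number modulo table size.
    private int hash(long pageNumber) {
        return (int) Math.floorMod(pageNumber, (long) buckets.length);
    }

    /** Adds a translation; only pages that are in use ever get an entry. */
    void map(long pageNumber, int frame) {
        int b = hash(pageNumber);
        buckets[b] = new Entry(pageNumber, frame, buckets[b]);
    }

    /** Returns the page frame for pageNumber, or -1 if there is no entry. */
    int lookup(long pageNumber) {
        for (Entry e = buckets[hash(pageNumber)]; e != null; e = e.next) {
            if (e.pageNumber == pageNumber) return e.frame;   // compare the stored page number
        }
        return -1;
    }
}
```

In a table of size 4, pages 1 and 5 hash to the same slot, so looking
up page 1 after both are mapped follows one link before finding it.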


[Figure: a virtual memory location (page no. and offset) is hashed by
f(#) into the page table, whose entries hold TAG, ENTRY and NEXT
fields and lead to a page frame. Image adapted from (Doeppner, 2011).]


This approach is particularly efficient when dealing with small
regions of allocated space in a larger, sparsely-populated data
region.

However, the page tables become very large, since each entry
requires three words. This problem is solved by using clustered
page tables.

Using this approach, multiple pages are grouped together into
superpages. Which page we actually require can be inferred from
the result of the hash function.

[Figure: page table lookup via the hash function f(#), showing TAG
and ENTRY fields. Image adapted from (Doeppner, 2011).]

Page tables in 64-bit systems

64-bit systems provide 2^64 bytes of addressable space.

A complete page table would occupy 16 petabytes (as opposed to
4MB for a 32-bit system)!
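
The two figures can be reproduced with a short calculation. The page
size and entry size are not stated above; the sketch assumes 4KB pages
and 4-byte page table entries, chosen because they give the quoted
4MB and 16 petabytes. PageTableSize is a hypothetical helper.

```java
import java.math.BigInteger;

final class PageTableSize {
    /**
     * Size in bytes of a complete page table: one entry per page of the
     * address space, i.e. 2^(addressBits - pageOffsetBits) entries.
     */
    static BigInteger completeTableBytes(int addressBits, int pageOffsetBits, int entryBytes) {
        BigInteger entries = BigInteger.ONE.shiftLeft(addressBits - pageOffsetBits);
        return entries.multiply(BigInteger.valueOf(entryBytes));
    }
}
```

Under those assumptions a 32-bit system needs 2^20 entries × 4 bytes =
2^22 bytes (4MB), and a 64-bit system needs 2^52 entries × 4 bytes =
2^54 bytes (16 petabytes).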


The other 32-bit solutions are also “worse” in this setting – e.g. a
forward-mapped page table with 4KB pages would require 4 to 7
lookups per translation.

The x64 architecture uses forward-mapped page tables with four
levels for 4KB pages, or three levels for 2MB pages.


Data structures assignment/week10c

Virtual memory in practice Fetch policies

CI583: Data Structures and Operating Systems
Memory management part 3


Outline

1 Virtual memory in practice

2 Fetch policies


Virtual memory in practice

The aim of virtual memory is to allow the OS to address a larger
amount of memory than the available primary storage.

The job of the OS is to make sure that using the (expensive)
techniques of virtual memory can be done as efficiently as possible.


http://www.cs.odu.edu/~cs471w


Consider the actions required when a page fault occurs:

1 Hardware address translation facility raises an interrupt.

2 Find a free page frame.

3 If there are no free frames, decide which existing frame to
reuse.

4 If the frame we’re reusing contains a modified page, write it
out to secondary storage.

5 Fetch the required page from secondary storage and modify
the page table.

6 Return from interrupt.
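
The six steps above can be modelled in a few lines of Java. This is a
toy simulation, not operating system code: PagingSim is a hypothetical
class, the interrupt is represented by an ordinary method call, and the
victim in step 3 is chosen first-in first-out, which is an assumption
made only to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class PagingSim {
    private final int frameCount;
    private final Map<Integer, Integer> pageTable = new HashMap<>(); // page -> frame, valid entries only
    private final int[] frameToPage;
    private final boolean[] dirty;                                   // frame holds a modified page
    private final ArrayDeque<Integer> loadOrder = new ArrayDeque<>(); // frames, oldest first
    int faults, writeBacks;

    PagingSim(int frameCount) {
        this.frameCount = frameCount;
        this.frameToPage = new int[frameCount];
        this.dirty = new boolean[frameCount];
        Arrays.fill(frameToPage, -1);
    }

    /** A thread references a page; a missing page table entry triggers the fault path. */
    void access(int page, boolean write) {
        Integer frame = pageTable.get(page);
        if (frame == null) frame = handleFault(page);  // step 1: the "interrupt"
        if (write) dirty[frame] = true;
    }

    private int handleFault(int page) {
        faults++;
        int frame;
        if (loadOrder.size() < frameCount) {
            frame = loadOrder.size();                  // step 2: a free frame exists
        } else {
            frame = loadOrder.removeFirst();           // step 3: pick a frame to reuse (FIFO)
            if (dirty[frame]) {                        // step 4: write out a modified page
                writeBacks++;
                dirty[frame] = false;
            }
            pageTable.remove(frameToPage[frame]);
        }
        frameToPage[frame] = page;                     // step 5: fetch and update the page table
        pageTable.put(page, frame);
        loadOrder.addLast(frame);
        return frame;                                  // step 6: return from interrupt
    }
}
```

Counting faults and write-backs separately makes the cost visible:
each fault implies at least one read from secondary storage, and
reusing a frame that holds a modified page adds a write.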


This is an expensive process, especially the IO involved, which
could take milliseconds to complete.

We have seen how to map virtual memory locations to page frames
in primary storage, and we have seen how to move pages from
secondary to primary storage when a page fault occurs.

Now we need to know which pages to hold in primary storage and
which to get rid of.


We can break this down into three areas:

1 The fetch policy: when to bring pages in from secondary
storage and which ones to bring.

2 The placement policy: where to put the pages in primary
storage (i.e. how to allocate page frames).

3 The replacement policy: which pages to remove from primary
storage and when to do it.


Fetch policies

Probably the simplest fetch policy we can imagine is demand
paging: fetch a page (only) when a thread references a location in
a page that is not in primary storage.

Thus, the execution of a program, P, begins by loading a single
page that contains the initial instructions for P. Each subsequent
page is retrieved as needed.

If the cost of fetching n pages is n× the cost of fetching one page,
this is the best we can do! It isn’t though – why not?


As we have said, we want to minimise the IO in all of this and
reduce the number of faults that occur.

One way to do this is by prepaging: fetching pages before they are
actually required (i.e. without having to go through a whole page
fault in order to get it).


How do we know which pages a process will require, before it
actually requires them?

Most of the time there is no way to know this, but it is a fair bet
that if a process requests a given page, it might well request its
subsequent pages next.

Fetching 2 or 3 pages in one go is not much more expensive than
fetching one, since the file system, disk controller etc are already
engaged. This is called readahead.
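
The effect of readahead on the number of faults can be illustrated
with a small simulation. It is deliberately simplified: primary
storage is treated as unlimited so that no replacement is needed, and
FetchPolicySim is a hypothetical name. A readahead of 0 corresponds to
pure demand paging.

```java
import java.util.HashSet;
import java.util.Set;

final class FetchPolicySim {
    /**
     * Counts page faults for a sequence of page references when each fault
     * fetches the faulting page plus the next 'readahead' pages.
     */
    static int countFaults(int[] references, int readahead) {
        Set<Integer> resident = new HashSet<>();   // pages currently in primary storage
        int faults = 0;
        for (int page : references) {
            if (resident.add(page)) {              // add returns true if the page was absent: a fault
                faults++;
                for (int i = 1; i <= readahead; i++) {
                    resident.add(page + i);        // prefetch the subsequent pages
                }
            }
        }
        return faults;
    }
}
```

For a process that touches pages 0 to 11 in order, demand paging takes
12 faults, while fetching two extra pages per fault takes 4. For a
process that jumps around at random, the extra pages would mostly be
wasted.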


Another way in which the fetch policy can have a big impact is in
choosing a page size that reflects the type of files in the system
and the way they are accessed.

http://research.google.com/archive/gfs.html


Next time

Policies for placement and replacement.


Data structures assignment/filesys.zip

filesys/BitBlock.java

public class BitBlock extends Block
{
    /**
     * Construct a bit block of the specified size in bytes.
     * @param blockSize the size of the block in bytes
     */
    public BitBlock(short blockSize)
    {
        super(blockSize);
    }

    /**
     * Set a specified bit to 1 (true).
     * @param whichBit the bit to set
     */
    public void setBit(int whichBit)
    {

 * Usage:
 *   java cat input-file ...
 *
 */
public class cat
{
    /**
     * The name of this program.
     * This is the program name that is used
     * when displaying error messages.
     */
    public static final String PROGRAM_NAME = "cat";

    /**
     * The size of the buffer to be used for reading from the
     * file. A buffer of this size is filled before writing
     * to the output file.
     */
    public static final int BUF_SIZE = 4096;

    /**
     * Reads files and writes to standard output.
     * @exception java.lang.Exception if an exception is thrown
     * by an underlying operation
     */
    public static void main(String[] argv) throws Exception
    {
        // initialize the file system simulator kernel
        Kernel.initialize();

        // display a helpful message if no arguments are given
        if (argv.length == 0)
        {
            System.err.println(PROGRAM_NAME + ": usage: java " +
                PROGRAM_NAME + " input-file ...");
            Kernel.exit(1);
        }

        // for each filename specified

 * Usage:
 *   java cp input-file output-file
 *
 */
public class cp
{
    /**
     * The name of this program.
     * This is the program name that is used
     * when displaying error messages.
     */
    public static final String PROGRAM_NAME = "cp";

    /**
     * The size of the buffer to be used when reading files.
     */
    public static final int BUF_SIZE = 4096;

    /**
     * The file mode to use when creating the output file.
     */
    // ??? perhaps this should be the same mode as the input file
    public static final short OUTPUT_MODE = 0700;

    /**
     * Copies an input file to an output file.
     * @exception java.lang.Exception if an exception is thrown by
     * an underlying operation
     */
    public static void main(String[] argv) throws Exception
    {
        // initialize the file system simulator kernel
        Kernel.initialize();

        // make sure we got the correct number of parameters
        if (argv.length != 2)
        {
            System.err.println(PROGRAM_NAME + ": usage: java " +
                PROGRAM_NAME + " input-file output-file");
            Kernel.exit(1);
        }

        // give the parameters more meaningful names
        String in_name = argv[0];
        String out_name = argv[1];

        // open the input file
        int in_fd = Kernel.open(in_name, Kernel.O_RDONLY);

        buffer[offset + 1] = (byte) d_ino;

 * Usage:
 *   java dump input-file
 *
 */
public class dump
{
    public static void main(String[] args)
    {

            System.out.print(j + " " + Integer.toHexString(c) + " " + c);

        {
            Kernel.setErrno(Kernel.EFBIG);
            return -1;
        }

        // ask the IndexNode for the actual block number
        // given the relative block number
        int blockOffset = indexNode.getBlockAddress(relativeBlockNumber);

        if (blockOffset == FileSystem.NOT_A_BLOCK)
        {
            // clear the bytes if it’s a block that was never written
            int blockSize = fileSystem.getBlockSize();

        {
            Kernel.setErrno(Kernel.EFBIG);
            return -1;
        }

        // ask the IndexNode for the actual block number
        // given the relative block number
        int blockOffset = indexNode.getBlockAddress(relativeBlockNumber);

        if (blockOffset == FileSystem.NOT_A_BLOCK)
        {
            // allocate a block; quit if we can’t
            blockOffset = fileSystem.allocateBlock();

if
(
blockOffset
< 0 ) return -1 ;
// update the inode
indexNode.setBlockAddress( relativeBlockNumber , blockOffset ) ;
// write the inode
fileSystem.writeIndexNode( indexNode , indexNodeNumber ) ;
}
// write the actual block from bytes
fileSystem.write( bytes , fileSystem.getDataBlockOffset() + blockOffset ) ;
return 0 ;
}
}

filesys/filesys.conf

! This is the configuration file for the file system simulator.
! It is used by all programs which use Kernel.initialize()
! to set various parameters for the kernel or other components
! of the simulator.
!
! The default configuration file name is "filesys.conf".
! To specify an alternate configuration file, define a
! property "filesys.conf" which contains the name of the
! alternate file. For example:
!
!   java -Dfilesys.conf=alternate.conf program-name parameter...
!
! where "alternate.conf" is the name of the new configuration file.
!
! You can use this file to set parameters for the kernel,
! the root file system, and the default process context.
! Each parameter is described below.
!
!
! filesystem.root.filename = filename-string
!
! Specifies the name of the file to "mount" as the root
! for the simulation.
!
! Default:
!
!   filesystem.root.filename = filesys.dat
!
filesystem.root.filename = filesys.dat
!
! filesystem.root.mode = mode-keyword
!
! Specifies the mode to use when opening the file.
! The mode should either be "rw" for reading and writing,
! or "r" for read-only access.
!
! Default:
!
!   filesystem.root.mode = rw
!
filesystem.root.mode = rw
!
! process.uid = short-decimal-value
!
! Specifies the numeric user id (uid) to use for the
! default process context. This should be a number between
! 0 and 32767.
!
! Default:
!
!   process.uid = 1
!
process.uid = 1
!
! process.gid = short-decimal-value
!
! Specifies the numeric group id (gid) to use for the
! default process context. This should be a number between
! 0 and 32767.
!
! Default:
!
!   process.gid = 1
!
process.gid = 1
!
! process.umask = short-octal-value
!
! Specifies the umask to use for the default process context.
! This should be an octal number between 000 and 777.
!
! Default:
!
!   process.umask = 022
!
process.umask = 022
!
! process.dir = path-string
!
! Specifies the working directory to be used for the default
! process context. This should be a string that starts with
! "/".
!
! Default:
!
!   process.dir = /root
!
process.dir = /root
!
! process.max_open_files = decimal-number
!
! Specifies the maximum number of files that may be open at
! one time by a process. When a process context is created, this
! many slots are created for possible open files.
!
! Default:
!
!   process.max_open_files = 10
!
process.max_open_files = 10
!
! kernel.max_open_files = decimal-number
!
! Specifies the maximum number of files that may be open
! at one time by all processes in the simulation. When the
! simulator starts, this many slots are created for possible
! open files.
!
! Default:
!
!   kernel.max_open_files = 20
!
kernel.max_open_files = 20

filesys/filesys_user_guide.html

MOSS File System Simulator User Guide

Contents
Purpose
Introduction
Overview
Using File System Simulator Programs
Using mkfs
Using mkdir
Using ls
Using tee
Using cp
Using cat
Dumping the File System
Simulator Configuration File
Configuration File Options
A Sample Configuration File
Specifying an Alternate Configuration File
Writing File System Simulator Programs
Enhancing the File System Simulator
Suggested Exercises

This document is a user guide for the MOSS File System Simulator. It explains how to use the simulator and describes the programs and the various input files used by and output files produced by the simulator.

Introduction

The file system simulator shows the inner workings of a UNIX V7 file system. The simulator reads or creates a file which represents the disk image, and keeps track of allocated and free blocks using a bit map.
A typical exercise might be for students to write a program (in Java) which invokes various simulated operating system calls against a well-known disk image provided by the instructor. Students may also be asked to implement indirect blocks or list-based free block management, or to write a utility (like fsck) to check and repair the file system.

Overview

The MOSS File System Simulator is a collection of Java classes which simulate the file system calls available in a typical Unix-like operating system. The "Kernel" class contains methods (functions) like "creat()", "open()", "read()", "write()", "close()", etc., which read and write blocks in an underlying file in much the same way that a real file system would read and write blocks on an underlying disk device.

In addition to the "Kernel" class, there are a number of underlying classes to support the implementation of the kernel. The classes FileSystem, IndexNode, DirectoryEntry, SuperBlock, Block, BitBlock, FileDescriptor, and Stat contain all data structures and algorithms which implement the simulated file system.

Also included are a number of sample programs which can be used to operate on a simulated file system. The Java programs "ls", "cat", "mkdir", "mkfs", etc., perform file system operations to list directories, display files, create directories, and create (initialize) file systems. These programs illustrate the various file system calls and allow the user to carry out various read and write operations on the simulated file system.

As mentioned above, there is a backing file for our simulated file system. A "dump" program is included with the distribution so that you can examine this file, byte-by-byte. Any dump program may be used (e.g., the "od" program in Unix); we include this one because it is simple to use and understand, and can be used with any operating system.

There are a number of ways you can use the simulator to get a better understanding of file systems.
You can
  use the provided utility programs (mkfs, mkdir, ls, cat, etc.) to perform operations on the simulated file system and use the dump program to examine the underlying file and observe any changes,
  examine the sample utility programs to see how they use the system call interface to perform file operations,
  enhance the sample utility programs to provide additional functionality,
  write your own utility programs to extend the functionality of the simulated file system, and
  modify the underlying Kernel and other implementation classes to extend the functionality of the simulated file system.

In the sections which follow, you will learn what you need to know to perform each of these activities.

Using File System Simulator Programs

Using mkfs

The mkfs program creates a file system backing file. It does this by creating a file whose size is specified by the block size and number of blocks given. It writes the superblock, the free list blocks, the inode blocks, and the data blocks for a new file system. Note that it will overwrite any existing file of the name specified, so be careful when you use this program. This program is similar to the "mkfs" program found in Unix-like operating systems.

The general format for the mkfs command is

  java mkfs file-name block-size blocks

where

file-name is the name of the backing file to create (e.g., filesys.dat). Note that this is the name of a real file, not a file in the simulator. This is the file that the simulator uses to simulate the disk device for the simulated file system. This may be any valid file name in your operating system environment.

block-size is the block size to be used for the file system (e.g., 256). This should be a multiple of the index node (i-node) size (usually 64) and the directory entry size (usually 16). Modern operating systems usually use a size of 1024 or 512 bytes. We use 128 or 256 byte block sizes in many of our examples so that you can quickly see what happens when directories grow beyond one block.
This should be a decimal number not less than 64, but less than 32768.

blocks is the number of blocks to create in the file system (e.g., 40). This number includes any blocks that may be used for the superblock, free list management, inodes, and data blocks. We use a relatively small number here so that you can quickly see what happens if you run out of disk space. This can be any decimal number greater than 3, but not greater than 2^24 - 1 (the maximum number of blocks), although you may not have sufficient space to create a very large file.

For example, the command

  java mkfs filesys.dat 256 40

will create (or overwrite) a file "filesys.dat" so that it contains 40 256-byte blocks for a total of 10240 bytes. The output from the command should look something like this:

  block_size: 256
  blocks: 40
  super_blocks: 1
  free_list_blocks: 1
  inode_blocks: 8
  data_blocks: 30
  block_total: 40

From the output you can see that one block is needed for the superblock, one for free list management, eight for index nodes, and the remaining 30 are available for data blocks.

Why is there 1 block for free list management? Note that 30 blocks require 30 bits in the free list bitmap. Since 256 bytes/block * 8 bits/byte = 2048 bits/block, clearly one bitmap block is sufficient to track block allocation for this file system.

Why are there 8 blocks for index nodes? Note that 30 blocks could result in 30 inodes if many one-block files or directories are created. Since each inode requires 64 bytes, only 4 will fit in a block. Therefore, 8 blocks are set aside for up to 32 inodes.

Using mkdir

The mkdir program can be used to create new directories in our simulated file system. It does this by creating the file specified as a directory file, and then writing the directory entries for "." and ".." to the newly created file. Note that all directories leading to the new directory must already exist. This program is similar to the "mkdir" command in Unix-like and MS-DOS-related operating systems.
The general format for the mkdir command is java mkdir directory-path where directory-path is the path of the directory to be created (e.g., "/root", or "temp", or "../home/rayo/moss/filesys"). If directory-path does not begin with a "/", then it is appended to the path name for working directory for the default process. For example, the command java mkdir /home creates a directory called "home" as a subdirectory of the root directory of the file system. Similarly, the command java mkdir /home/rayo creates a directory called "rayo" as a subdirectory of the "home" directory, which is presumed to already exist as a subdirectory of the root directory of the file system. Using ls The ls program is used to list information about files and directories in our simulated file system. For each file or directory name given it displays information about the files named, or in the case of directories, for each file in the directories named. This program is similar to the "ls" command in Unix-like operating systems, or the "dir" command in DOS-related operating systems. The general format for the ls command is java ls path-name ... where path-name ... is a space-separated list of one or more file or directory path names. For example, the command java ls /home lists the contents of the "/home" directory. For each file in the directory, a line is printed showing the name of the file or subdirectory, and other pertinent information such as size. The output from the command should look something like this: /home: 1 48 . 0 48 .. 2 32 rayo total files: 3 In this case we see that the "/home" directory contains entries for ".", "..", and "rayo". Using tee The tee program reads from standard input and writes whatever is read to both standard output and the named file. You can use this program to create files in our simulated file system with content created in the operating system environment. This program is similar to the "tee" command found in many Unix-like operating systems. 
The general format for the tee command is java tee file-path where file-path is the name of a file to be created in the simulated file system. If the named file already exists, it will be overwritten. For example, echo "howdy, podner" | java tee /home/rayo/hello.txt causes the single line "howdy, podner" to be written to the file "/home/rayo/hello.txt". The output from the command is howdy, podner which you should note was the same as the input sent to the tee program by the "echo" command. Note that the "|" (pipe) is almost always used with the tee program. Users of Unix-like operating systems will find the "echo", and "cat" commands useful to produce input for the pipe to tee. Users of MS-DOS-related operating systems will find the "echo" and "type" commands to be useful in this regard. If you wish to simply enter text directly to a file, then you may use tee directly (i.e., without the pipe). Users of Unix-like operating systems will need to use CTRL-D to signal the end of input. Users of MS-DOS-related operating systems will need to use CTRL-Z to signal the end of input. Using cp The cp program allows you to copy the contents from one file to another in our simulated file system. If the destination file already exists, it will be overwritten. This program is similar to the "cp" command in Unix-like operating systems, and the "copy" command in MS-DOS-related operating systems. The general format of the "cp" command is java cp input-file-name output-file-name where input-file-name is the path-name for the file to be copied (i.e., the source file, and output-file-name is the path-name for the file to be created (i.e., the target file. For example, java cp /home/rayo/hello.txt /home/rayo/greeting.txt creates a new file "/home/rayo/greeting.txt" by copying to it the contents of file "/home/rayo/hello.txt". Using cat The cat program reads the contents of a named file and writes it to standard output. 
The cat program is generally used to display the contents of a file. This program is similar to the "cat" command in Unix-like operating systems, or the "type" command in MS-DOS-related operating systems. The general format of the cat command line is java cat file-name where file-name is the name of the file from which data are to be read for writing to standard output. For example, java cat /home/rayo/greeting.txt causes the file "/home/rayo/greeting.txt" to be read, the contents of which are written to standard output. In this case, the output from the program might look something like this howdy, podner Dumping the File System While you are working with the file system simulator, you may wish to dump the contents of the backing file to see if it contains what you think it contains. The dump program shows the contents of a file in the operating environment, one byte at a time, in various formats (hexadecimal, decimal, ASCII). Note that dump dumps the contents of a real file, not a file in our simulated file system. The general format of the dump command line is java dump file-name where file-name is the name of the file to be dumped. This should generally be the name of the backing file for the file system simulator (e.g., "filesys.dat"). The general format of the dump output is addr hex dec asc where addr is the decimal address of the byte, hex is the hexadecimal value of the byte, dec is the decimal value of the byte, and asc is the corresponding ASCII character if the value is between 33 and 127 (decimal). Each line of dump output corresponds to a single byte in the file. To keep the listing brief, dump only displays non-zero bytes from the input file. For example java dump filesys.dat | more causes the contents of the file "filesys.dat" to be displayed, one line per byte. The "| more" causes you to be prompted for each page of the output. 
The first page of the output should look something like this:

  0 1 1
  5 28 40 (
  9 1 1
  13 2 2
  17 a 10
  256 1f 31
  512 40 64 @
  515 3 3
  523 30 48 0
  527 ff 255
  528 ff 255
  529 ff 255
  530 ff 255
  531 ff 255
  532 ff 255
  533 ff 255
  534 ff 255
  535 ff 255
  536 ff 255
  537 ff 255
  538 ff 255
  539 ff 255
  540 ff 255
  541 ff 255

You should notice, for example, that the first block (the super block) contains a few numeric values corresponding to the block size (the 1 in the 0 byte means 256), number of blocks, etc. The second block (starting at byte 256) contains a few bits that are set, indicating that the first few blocks are allocated. The third block (starting at 512) contains a few index nodes; the FF/255 values indicate that a direct block is unallocated. A little further down you will see "." and ".." for the directory entries for the root file system, and other data blocks.

Simulator Configuration File

Each file system simulator program must call Kernel.initialize() before calling any of the other Kernel methods. The initialize() method reads a configuration file ("filesys.conf" is the default), opens the backing file for the file system ("filesys.dat" is the default), and performs other initializations. This section of the user guide describes the various options which may be set in the configuration file.

Configuration File Options

Name: filesystem.root.filename
Description: The name of the file containing the root file system for the simulation.
Default Value: filesys.dat

Name: filesystem.root.mode
Description: The mode to use when opening the root file system backing file. The mode should either be "rw" for reading and writing, or "r" for read-only access.
Default Value: rw

Name: process.uid
Description: The numeric user id (uid) to use for the default process context. This should be a number between 0 and 32767.
Default Value: 1

Name: process.gid
Description: The numeric group id (gid) to use for the default process context. This should be a number between 0 and 32767.
Default Value: 1

Name: process.umask
Description: The umask to use for the default process context. This should be an octal number between 000 and 777.
Default Value: 022

Name: process.dir
Description: The working directory in the simulated file system to be used for the default process context. This should be a string that starts with "/".
Default Value: /root

Name: process.max_open_files
Description: The maximum number of files that may be open at a time by a process. When a process context is created, this many slots are created for possible open files.
Default Value: 10

Name: kernel.max_open_files
Description: The maximum number of files that may be open at one time by all processes in the simulation. When the simulator starts, this many slots are created for possible open files.
Default Value: 20

A Sample Configuration File

In addition to the standard configuration file, "filesys.conf", the distribution also includes a smaller sample configuration file, "sample.conf". This is shown below to illustrate a typical configuration file.

  !
  ! my personal filesys configuration file
  !
  filesystem.root.filename = rayo.dat
  filesystem.root.mode = r
  process.uid = 1000
  process.gid = 1000
  process.umask = 002
  process.dir = /home/rayo

In this particular example, the file system is contained in the backing file "rayo.dat", which is here being opened for read-only access. The working directory for the default process context is "/home/rayo", with the uid, gid, and umask shown.

Specifying an Alternate Configuration File

The default configuration file is named "filesys.conf" and is included in the application distribution. You may modify this file directly to set various options, or you may create your own configuration file and specify the name of this new file when you launch your simulator programs.

If you choose to create your own configuration file, you will need to define a system property "filesys.conf" which contains the name of the file. For example, suppose you wanted to run the "ls" program using "my_filesys.conf" as the configuration file.
Your java command would look something like this: java -Dfilesys.conf=my_filesys.conf ls /home If there is no value set for the "filesys.conf" system property, then the name "filesys.conf" is used as the default configuration filename. Writing File System Simulator Programs Writing programs that use the File System Simulator requires the use of the Kernel class, and may involve the use of the classes Stat and DirectoryEntry. If you're writing ordinary programs that use the standard file system calls, you should not need to reference any other classes. These three classes are described briefly here. For more information, follow the link for the class to the javadoc for that class. Kernel sets up the simulator environment and defines all the system calls. This class defines: the method initialize(), which is used to initialize the file system simulator; the creat(), open(), read(), write(), close(), and other methods which simulate the work of a file system; and constants like EBADF, S_IFDIR, and O_RDONLY which are used to represent parameter or return values for the system calls. All the methods and fields of Kernel are static; you do not instantiate a Kernel object. For examples, see any of the sample programs (i.e., cat.java, cp.java, ls.java, etc.) Stat is a data structure that represents information about a file or directory. This intends to faithfully represent the Unix stat struct. You may reference fields within a stat object directly (e.g., stat.st_ino), or using JavaBean-style accessor/mutator methods (e.g., stat.getIno() or stat.setIno(). Stat objects are updated by the methods Kernel.stat() and Kernel.fstat(). For examples, see mkdir.java. DirectoryEntry is a data structure that represents a single record in a directory file. This intends to faithfully represent a Unix dirent struct. It contains an index node number and a file name. 
You may reference the fields directly (e.g., dirent.d_ino), or using JavaBean-style accessor/mutator methods (e.g., dirent.getIno() or dirent.setIno()). However, Java programmers may find it more convenient to use the getName() and setName() methods (which use String) instead of the field d_name (which is byte[]). DirectoryEntry objects are updated by the method Kernel.readdir(). For examples, see mkdir.java and ls.java.

For more information about Unix system calls and the stat and dirent structs, refer to a Unix system manual. Users of Unix-like systems may find the commands "man -S 2 creat", "man -S 2 open", etc. to be helpful.

All programs that use the File System Simulator should adhere to the following guidelines:
  Invoke the method Kernel.initialize() before any other File System Simulator calls.
  Use Kernel.exit() when you wish to terminate processing in your program.
  Check for errors after each system call (e.g., creat(), open(), read(), write(), etc.). Nearly all the system calls return -1 if an error occurs.
  Use Kernel.perror() to print the message associated with an error.
  Use Kernel.getErrno() to determine which error occurred, if needed. Note that in standard Unix programs you would reference the static process variable "errno".

For examples, take a look at the following sample programs in the distribution: cat.java, cp.java, ls.java, mkdir.java, tee.java.

Collectively, these sample programs invoke all of the core methods (system calls) of the file system simulator.

Enhancing the File System Simulator

Adding new features to the File System Simulator is an excellent way to probe your understanding of file system operation, and to investigate new features. Enhancements will almost certainly require changes to the class Kernel, and may necessitate changes to the sample programs described above.
This section describes the other classes that implement the functionality of the simulator so that you may understand the intended organization of these components when making a proposed enhancement. The following are the internal classes for the file system simulator: BitBlock is a data structure that views a device block as a sequence of bits. The methods setBit(), resetBit(), and isBitSet() are used to set, reset, or check a bit in the block. This structure is used to implement bitmaps, and is used by the file system simulator to track allocated and free data blocks in the file system. BitBlock extends Block. Block is a data structure that views a device block as a sequence of bytes. The field bytes is an array of byte, and is directly accessible. Included are methods to read() and write() the block to a java.io.RandomAccessFile, which simulate the action of reading or writing a device block. FileDescriptor is a structure and collection of methods that represent an open file. It includes a number of get and set methods for various tidbits of information about the open file, and provides readBlock and writeBlock() methods for reading and writing the blocks of the file. FileSystem is a structure and collection of methods that represent an open (mounted) file system. It includes a few get and set methods for various fields about the file system, but more importantly, includes methods to open() the file behind the file system, to read() and write() blocks of the device, to manage blocks (allocateBlock() and freeBlock()) and to manage inodes (allocateIndexNode()). In general, Kernel methods should call FileSystem methods when they want to read or write data in the file system. IndexNode is a structure and collection of methods for representing an index node. This is meant to reflect the exact structure on disk for an index node. It includes get and set methods for each of the fields in the index node. 
Also included are read() and write() methods which are used to copy data to and from byte arrays (not disk files). ProcessContext is a structure and collection of methods to represent a process. This is where the simulator stores the uid, gid, umask, dir, and other information for the current process. It includes get and set methods for each of the fields in a process. SuperBlock is a structure and collection of methods for representing the superblock on the disk. In our implementation, the superblock contains information about the block size, number of blocks, offsets to the first block of the free list, inode block, and data block areas of the device. It includes get and set methods for each of the fields in the superblock. Also included are methods to read() and write() the superblock. Of course, you should look at the code and plan your enhancements carefully. Suggested Exercises Use mkfs to create a file system with a block size of 64 bytes and having a total of 8 blocks. How many index nodes will fit in a block? How many directory entries will fit in a block? Use dump to examine the file system backing file, and note the value in byte 64. What does this value represent? Use mkdir to create a directory (e.g., /usr), and then use dump to examine byte 64 again. What do you notice? Repeat the process of creating a directory (e.g., /bin, /lib, /var, /etc, /home, /mnt, etc.) and examining with dump. How many directories can you create before you fill up the file system? Explain why. 
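The layout arithmetic from the mkfs section (one free-list bitmap block and eight inode blocks for a 256-byte, 40-block file system) can be checked with a short sketch. It assumes one superblock block, 64-byte inodes, and one bitmap bit plus one inode provisioned per data block, which is how the guide explains the numbers. The class and method names are hypothetical, and this is not necessarily how mkfs itself computes the layout.

```java
// Hypothetical helper, not part of the MOSS distribution.
class MkfsLayoutSketch {
    static final int INODE_SIZE = 64;   // index node size quoted in the guide
    static final int SUPER_BLOCKS = 1;  // assumption: the superblock fits in one block

    /** Blocks needed to hold `items` entries at `perBlock` entries per block (ceiling division). */
    public static int blocksFor(int items, int perBlock) {
        return (items + perBlock - 1) / perBlock;
    }

    /** Total blocks consumed when `dataBlocks` data blocks are provisioned. */
    public static int totalBlocks(int blockSize, int dataBlocks) {
        int bitmapBlocks = blocksFor(dataBlocks, blockSize * 8);         // one bit per data block
        int inodeBlocks = blocksFor(dataBlocks, blockSize / INODE_SIZE); // one inode per data block
        return SUPER_BLOCKS + bitmapBlocks + inodeBlocks + dataBlocks;
    }

    /** Largest data block count whose total layout still fits in `blocks`. */
    public static int dataBlocks(int blockSize, int blocks) {
        int d = 0;
        while (totalBlocks(blockSize, d + 1) <= blocks) {
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        int blockSize = 256;
        int blocks = 40;
        int d = dataBlocks(blockSize, blocks);
        System.out.println("free_list_blocks: " + blocksFor(d, blockSize * 8));
        System.out.println("inode_blocks: " + blocksFor(d, blockSize / INODE_SIZE));
        System.out.println("data_blocks: " + d);
    }
}
```

For the guide's example (block size 256, 40 blocks) this reproduces the quoted figures of 1 free-list block, 8 inode blocks, and 30 data blocks.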
filesys/FileSystem.class public synchronized class FileSystem { private java.io.RandomAccessFile file; private String filename; private String mode; private short blockSize; private int blockCount; private int freeListBlockOffset; private int inodeBlockOffset; private int dataBlockOffset; private IndexNode rootIndexNode; public static short ROOT_INDEX_NODE_NUMBER; public static int NOT_A_BLOCK; private int currentFreeListBitNumber; private int currentFreeListBlock; private BitBlock freeListBitBlock; private short currentIndexNodeNumber; private short currentIndexNodeBlock; private byte[] indexBlockBytes; public void FileSystem(String, String) throws java.io.IOException; public short getBlockSize(); public int getFreeListBlockOffset(); public int getInodeBlockOffset(); public int getDataBlockOffset(); public IndexNode getRootIndexNode(); public void open() throws java.io.IOException; public void close() throws java.io.IOException; public void read(byte[], int) throws java.io.IOException; public void write(byte[], int) throws java.io.IOException; public void freeBlock(int) throws java.io.IOException; public int allocateBlock() throws java.io.IOException; private void loadFreeListBlock(int) throws java.io.IOException; public short allocateIndexNode() throws java.io.IOException; public void readIndexNode(IndexNode, short) throws java.io.IOException; public void writeIndexNode(IndexNode, short) throws java.io.IOException; private void loadIndexNodeBlock(short) throws java.io.IOException; static void ();
}
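The class listing above declares freeBlock(int) and allocateBlock(), and the user guide says BitBlock tracks allocated and free data blocks with setBit(), resetBit(), and isBitSet(). The following is a minimal in-memory sketch of that idea, not the simulator's code. The class name is hypothetical, the bit order within a byte (least significant bit first) is an assumption, and nothing is written to a backing file.

```java
// Hypothetical illustration of a bitmap free list with a wrap-around scan.
class BitmapFreeListSketch {
    private final byte[] bits;     // one bit per data block; 1 = allocated
    private final int blockCount;  // number of data blocks tracked
    private int next = 0;          // position where the next scan starts

    public BitmapFreeListSketch(int blockCount) {
        this.blockCount = blockCount;
        this.bits = new byte[(blockCount + 7) / 8];
    }

    public boolean isBitSet(int n) {
        return (bits[n / 8] & (1 << (n % 8))) != 0;
    }

    public void setBit(int n) {
        bits[n / 8] |= (byte) (1 << (n % 8));
    }

    public void resetBit(int n) {
        bits[n / 8] &= (byte) ~(1 << (n % 8));
    }

    /** Returns the number of the block just allocated, or -1 if every block is in use. */
    public int allocateBlock() {
        // Examine each block at most once, starting where the last scan stopped.
        for (int scanned = 0; scanned < blockCount; scanned++) {
            int candidate = next;
            next = (next + 1) % blockCount;
            if (!isBitSet(candidate)) {
                setBit(candidate);
                return candidate;
            }
        }
        return -1;
    }

    /** Marks a block as free again. */
    public void freeBlock(int n) {
        resetBit(n);
    }

    public static void main(String[] args) {
        BitmapFreeListSketch freeList = new BitmapFreeListSketch(3);
        System.out.println(freeList.allocateBlock());
        System.out.println(freeList.allocateBlock());
        freeList.freeBlock(0);
        System.out.println(freeList.allocateBlock());
    }
}
```

Starting each scan where the previous one stopped avoids rescanning the densely allocated start of the bitmap on every call; a freed block is picked up again once the scan wraps around.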

filesys/FileSystem.java

import java.io.RandomAccessFile;
import java.io.IOException;

/**
 * A simulated file system.
 */
public class FileSystem {

    private RandomAccessFile file = null;
    private String filename = null;
    private String mode = null;
    private short blockSize = 0;
    private int blockCount = 0;
    private int freeListBlockOffset = 0;
    private int inodeBlockOffset = 0;
    private int dataBlockOffset = 0;
    private IndexNode rootIndexNode = null;

    public static short ROOT_INDEX_NODE_NUMBER = 0;
    public static int NOT_A_BLOCK = 0x00FFFFFF;

    /**
     * Construct a FileSystem and open a FileSystem file.
     * @param newFilename the name of the FileSystem file to open
     * @param newMode the mode ("r" or "rw") to use when opening the file
     * @exception java.io.IOException if any IOExceptions are thrown
     * during the open.
     */
    public FileSystem(String newFilename, String newMode) throws IOException {
        super();
        filename = newFilename;
        mode = newMode;
        open();
    }

    /**
     * Get the blockSize for this FileSystem.
     * @return the block size in bytes
     */
    public short getBlockSize() {
        return blockSize;
    }

    public int getFreeListBlockOffset() {
        return freeListBlockOffset;
    }

    public int getInodeBlockOffset() {
        return inodeBlockOffset;
    }

    public int getDataBlockOffset() {
        return dataBlockOffset;
    }

    /**
     * Get the rootIndexNode for this FileSystem.
     * @return the root index node
     */
    public IndexNode getRootIndexNode() {
        return rootIndexNode;
    }

    /**
     * Open a backing file for this FileSystem and read the superblock.
     * @exception java.io.IOException if the open or read causes
     * IOException to be thrown
     */
    public void open() throws IOException {
        file = new RandomAccessFile(filename, mode);

        // read the block size and other information from the superblock
        SuperBlock superBlock = new SuperBlock();
        superBlock.read(file);
        blockSize = superBlock.getBlockSize();
        blockCount = superBlock.getBlocks();
        // ??? inodeCount
        freeListBlockOffset = superBlock.getFreeListBlockOffset();
        inodeBlockOffset = superBlock.getInodeBlockOffset();
        dataBlockOffset = superBlock.getDataBlockOffset();

        // initialize free list block buffer
        freeListBitBlock = new BitBlock(blockSize);

        // initialize index block buffer
        indexBlockBytes = new byte[blockSize];

        // read the root index node
        rootIndexNode = new IndexNode();
        readIndexNode(rootIndexNode, ROOT_INDEX_NODE_NUMBER);
    }

    /**
     * Close the backing file for this FileSystem, if any.
     * @exception java.io.IOException if closing the backing
     * file causes any IOException to be thrown
     */
    public void close() throws IOException {
        if (file != null)
            file.close();
    }

    /**
     * Read bytes into a buffer from the specified absolute block number
     * of the file system.
     * @param bytes the byte buffer into which the block should be read
     * @param blockNumber the absolute block number which should be read
     * @exception java.io.IOException if there are any exceptions during
     * the read from the underlying "file system" file.
     */
    public void read(byte[] bytes, int blockNumber) throws IOException {
        file.seek(blockNumber * blockSize);
        file.readFully(bytes);
    }

    /**
     * Write bytes from a buffer to the specified absolute block number
     * of the file system.
     * @param bytes the byte buffer from which the block should be written
     * @param blockNumber the absolute block number which should be written
     * @exception java.io.IOException if there are any exceptions during
     * the write to the underlying "file system" file.
     */
    public void write(byte[] bytes, int blockNumber) throws IOException {
        file.seek(blockNumber * blockSize);
        file.write(bytes);
    }

    private int currentFreeListBitNumber = 0;
    private int currentFreeListBlock = -1;
    private BitBlock freeListBitBlock = null;

    /**
     * Mark a data block as being free in the free list.
     * @param dataBlockNumber the data block which is to be marked free
     * @exception java.io.IOException if any exception occurs during an
     * operation on the underlying "file system" file.
     */
    public void freeBlock(int dataBlockNumber) throws IOException {
        loadFreeListBlock(dataBlockNumber);
        freeListBitBlock.resetBit(dataBlockNumber % (blockSize * 8));
        file.seek((freeListBlockOffset + currentFreeListBlock) * blockSize);
        freeListBitBlock.write(file);
    }

  /**
   * Allocate a data block from the list of free blocks.
   * @return the data block number which was allocated; -1 if no blocks
   * are available
   * @exception java.io.IOException if any exception occurs during an
   * operation on the underlying "file system" file.
   */
  public int allocateBlock() throws IOException
  {
    // from our current position in the free list block,
    // scan until we find an open position. If we get back to
    // where we started, there are no free blocks and we return
    // -1.
    int save = currentFreeListBitNumber ;
    while( true )
    {
      loadFreeListBlock( currentFreeListBitNumber ) ;
      boolean allocated = freeListBitBlock.isBitSet(
        currentFreeListBitNumber % ( blockSize * 8 ) ) ;
      int previousFreeListBitNumber = currentFreeListBitNumber ;
      currentFreeListBitNumber ++ ;
      // if curr bit number >= data block count, set to 0
      if( currentFreeListBitNumber >= ( blockCount - dataBlockOffset ) )
        currentFreeListBitNumber = 0 ;
      if( ! allocated )
      {
        freeListBitBlock.setBit( previousFreeListBitNumber % ( blockSize * 8 ) ) ;
        file.seek( ( freeListBlockOffset + currentFreeListBlock ) * blockSize ) ;
        freeListBitBlock.write( file ) ;
        return previousFreeListBitNumber ;
      }
      if( save == currentFreeListBitNumber )
      {
        Kernel.setErrno( Kernel.ENOSPC ) ;
        return -1 ;
      }
    }
  }
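The circular scan used by allocateBlock() can be illustrated in isolation. The following is only a sketch: it swaps the on-disk BitBlock for an in-memory java.util.BitSet, and the class name CircularAllocator and its fields are hypothetical, not part of the simulator. It assumes the block count is greater than zero.

```java
import java.util.BitSet;

/** Hypothetical in-memory stand-in for the on-disk free list. */
final class CircularAllocator {
    private final BitSet used;   // a set bit means "block allocated"
    private final int count;     // number of data blocks tracked (assumed > 0)
    private int next = 0;        // position where the next scan starts

    CircularAllocator(int count) {
        this.count = count;
        this.used = new BitSet(count);
    }

    /** Returns the allocated block number, or -1 if every block is in use. */
    int allocate() {
        int start = next;
        do {
            int candidate = next;
            next = (next + 1) % count;      // wrap around at the end
            if (!used.get(candidate)) {
                used.set(candidate);
                return candidate;
            }
        } while (next != start);            // back where we began: nothing free
        return -1;
    }

    /** Marks a block as free again. */
    void free(int block) {
        used.clear(block);
    }
}
```

Because the scan resumes from the last position rather than from zero, a freed block is only reused once the scan wraps around to it or no later block is free.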

  /**
   * Loads the block containing the specified data block bit into
   * the free list block buffer. This is a convenience method.
   * @param dataBlockNumber the data block number
   * @exception java.io.IOException
   */
  private void loadFreeListBlock( int dataBlockNumber ) throws IOException
  {
    int neededFreeListBlock = dataBlockNumber / ( blockSize * 8 ) ;
    if( currentFreeListBlock != neededFreeListBlock )
    {
      file.seek( ( freeListBlockOffset + neededFreeListBlock ) * blockSize ) ;
      freeListBitBlock.read( file ) ;
      currentFreeListBlock = neededFreeListBlock ;
    }
  }
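The arithmetic that maps a data block number to a position in the free list can be checked with a small worked example. This is a sketch only; the class FreeListLocation and its method names are assumptions made for illustration, while the formulas mirror the expressions in freeBlock() and loadFreeListBlock().

```java
/** Hypothetical helper mirroring the free-list index arithmetic. */
final class FreeListLocation {
    /** Index of the free-list block that holds the bit for dataBlockNumber. */
    static int blockIndex(int dataBlockNumber, int blockSize) {
        return dataBlockNumber / (blockSize * 8);   // 8 bits per byte
    }

    /** Position of the bit inside that free-list block. */
    static int bitIndex(int dataBlockNumber, int blockSize) {
        return dataBlockNumber % (blockSize * 8);
    }

    /** Byte offset in the backing file at which that free-list block starts. */
    static long byteOffset(int freeListBlockOffset, int blockIndex, int blockSize) {
        return (long) (freeListBlockOffset + blockIndex) * blockSize;
    }
}
```

With a 256-byte block (2048 bits per block), data block 5000 maps to free-list block 2, bit 904; if the free list starts at block 1, that block begins at byte 768.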

  /**
   * The index node number that will next be checked to see
   * if it is available.
   */
  private short currentIndexNodeNumber = 0 ;

  /**
   * The number of the index node block which is currently
   * loaded into indexBlockBytes. If no block is loaded,
   * this contains the value "-1".
   */
  private short currentIndexNodeBlock = -1 ;

  /**
   * The byte buffer used for reading and writing
   * index node blocks. You can think of this as
   * a one-block cache.
   */
  private byte[] indexBlockBytes = null ;

  /**
   * Allocate an index node for the file system.
   * @return the inode number for the next available index node;
   * -1 if there are no index nodes available.
   * @exception java.io.IOException if there is an exception during
   * an operation on the underlying "file system" file.
   */
  public short allocateIndexNode() throws IOException
  {
    // from our current position in the index node block list,
    // scan until we find an open position. If we get back to
    // where we started, there are no free inodes and we return
    // -1.
    short save = currentIndexNodeNumber ;
    IndexNode temp = new IndexNode() ;
    while( true )
    {
      readIndexNode( temp , currentIndexNodeNumber ) ;
      short previousIndexNodeNumber = currentIndexNodeNumber ;
      currentIndexNodeNumber ++ ;
      // if curr inode >= avail inode space, set to 0
      if( currentIndexNodeNumber >=
          ( ( dataBlockOffset - inodeBlockOffset ) *
            ( blockSize / IndexNode.INDEX_NODE_SIZE ) ) )
        currentIndexNodeNumber = 0 ;
      if( temp.getNlink() == 0 )
      {
        // ??? should we update nlinks here?
        return previousIndexNodeNumber ;
      }
      if( save == currentIndexNodeNumber )
      {
        // ??? it seems like we should give a different error here
        Kernel.setErrno( Kernel.ENOSPC ) ;
        return -1 ;
      }
    }
  }

  /**
   * Reads an index node at the index node location specified.
   * @param indexNode the index node
   * @param indexNodeNumber the location
   * @exception java.io.IOException if any exception occurs in an
   * underlying operation on the "file system" file.
   */
  public void readIndexNode( IndexNode indexNode , short indexNodeNumber )
    throws IOException
  {
    loadIndexNodeBlock( indexNodeNumber ) ;
    indexNode.read( indexBlockBytes ,
      ( indexNodeNumber * IndexNode.INDEX_NODE_SIZE ) % blockSize ) ;
  }

  /**
   * Writes an index node at the index node location specified.
   * @param indexNode the index node
   * @param indexNodeNumber the location
   * @exception java.io.IOException if any exception occurs in an
   * underlying operation on the "file system" file.
   */
  public void writeIndexNode( IndexNode indexNode , short indexNodeNumber )
    throws IOException
  {
    loadIndexNodeBlock( indexNodeNumber ) ;
    indexNode.write( indexBlockBytes ,
      ( indexNodeNumber * IndexNode.INDEX_NODE_SIZE ) % blockSize ) ;
    write( indexBlockBytes , inodeBlockOffset + currentIndexNodeBlock ) ;
  }

  /**
   * Loads the block containing the specified index node into
   * the index node block buffer. This is a convenience method.
   * @param indexNodeNumber the index node number
   * @exception java.io.IOException
   */
  private void loadIndexNodeBlock( short indexNodeNumber ) throws IOException
  {
    short neededIndexNodeBlock = (short)( indexNodeNumber /
      ( blockSize / IndexNode.INDEX_NODE_SIZE ) ) ;
    if( currentIndexNodeBlock != neededIndexNodeBlock )
    {
      read( indexBlockBytes , inodeBlockOffset + neededIndexNodeBlock ) ;
      currentIndexNodeBlock = neededIndexNodeBlock ;
    }
  }
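Locating an index node works the same way as locating a free-list bit, except that the unit is a 64-byte inode rather than a single bit. The sketch below is illustrative only: InodeLocation and its methods are hypothetical names, and the constant copies the value of IndexNode.INDEX_NODE_SIZE shown in this listing.

```java
/** Hypothetical helper mirroring the inode index arithmetic. */
final class InodeLocation {
    // Same value as IndexNode.INDEX_NODE_SIZE in the listing.
    static final int INDEX_NODE_SIZE = 64;

    /** Index (relative to the first inode block) of the block holding the inode. */
    static int blockIndex(int inodeNumber, int blockSize) {
        int inodesPerBlock = blockSize / INDEX_NODE_SIZE;
        return inodeNumber / inodesPerBlock;
    }

    /** Byte offset of the inode inside its block. */
    static int offsetInBlock(int inodeNumber, int blockSize) {
        return (inodeNumber * INDEX_NODE_SIZE) % blockSize;
    }
}
```

For a 256-byte block there are four inodes per block, so inode 9 lives in inode block 2 at byte offset 64.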

filesys/IndexNode.java

/**
 * An index node for a simulated file system.
 */
public class IndexNode
{
  /**
   * Size of each index node in bytes.
   */
  public static final int INDEX_NODE_SIZE = 64 ;

  /**
   * Maximum number of direct blocks in an index node.
   */
  public static final int MAX_DIRECT_BLOCKS = 10 ;

  /**
   * Maximum number of blocks in a file. If indirect,
   * doubleIndirect, or tripleIndirect blocks are implemented,
   * this number will need to be increased.
   */
  public static final int MAX_FILE_BLOCKS = MAX_DIRECT_BLOCKS ;

  /**
   * Mode for this index node. This includes file type and file protection
   * information.
   */
  private short mode = 0 ;

  /**
   * Not yet implemented.
   * Number of links to this file.
   */
  private short nlink = 0 ;

  /**
   * Not yet implemented.
   * Owner's user id.
   */
  private short uid = 0 ;

  /**
   * Not yet implemented.
   * Owner's group id.
   */
  private short gid = 0 ;

  /**
   * Number of bytes in this file.
   */
  private int size = 0 ;

  /**
   * Array of direct blocks containing the block addresses for the
   * first MAX_DIRECT_BLOCKS blocks of the file. Note that each
   * element in the array is stored as a 3-byte number on disk.
   */
  private int directBlocks[] =
  {
    FileSystem.NOT_A_BLOCK ,
    FileSystem.NOT_A_BLOCK ,
    FileSystem.NOT_A_BLOCK
  } ;

  /**
   * Not yet implemented.
   */
  private int indirectBlock = FileSystem.NOT_A_BLOCK ;

  /**
   * Not yet implemented.
   */
  private int doubleIndirectBlock = FileSystem.NOT_A_BLOCK ;

  /**
   * Not yet implemented.
   */
  private int tripleIndirectBlock = FileSystem.NOT_A_BLOCK ;

  /**
   * Not yet implemented.
   * The date and time at which this file was last accessed.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int atime = 0 ;

  /**
   * Not yet implemented.
   * The date and time at which this file was last modified.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int mtime = 0 ;

  /**
   * Not yet implemented.
   * The date and time at which this file was created.
   * This is traditionally implemented as the number of seconds
   * past 1970/01/01 00:00:00
   */
  private int ctime = 0 ;

  /**
   * Creates an index node.
   */
  public IndexNode()
  {
    super() ;
  }

  /**
   * Sets the mode for this IndexNode.
   * This is the file type and file protection information.
   */
  public void setMode( short newMode )
  {
    mode = newMode ;
  }

  /**
   * Gets the mode for this IndexNode.
   * This is the file type and file protection information.
   */
  public short getMode()
  {
    return mode ;
  }

  /**
   * Set the number of links for this IndexNode.
   * @param newNlink the number of links
   */
  public void setNlink( short newNlink )
  {
    nlink = newNlink ;
  }

  /**
   * Get the number of links for this IndexNode.
   * @return the number of links
   */
  public short getNlink()
  {
    return nlink ;
  }

  public void setUid( short newUid )
  {
    uid = newUid ;
  }

  public short getUid()
  {
    return uid ;
  }

  public short getGid()
  {
    return gid ;
  }

  public void setGid( short newGid )
  {
    gid = newGid ;
  }

  /**
   * Sets the size for this IndexNode.
   * This is the number of bytes in the file.
   */
  public void setSize( int newSize )
  {
    size = newSize ;
  }

  /**
   * Gets the size for this IndexNode.
   * This is the number of bytes in the file.
   */
  public int getSize()
  {
    return size ;
  }

  /**
   * Gets the address corresponding to the specified
   * sequential block of the file.
   * @param block the sequential block number
   * @return the address of the block, a number between zero and one
   * less than the number of blocks in the file system
   * @exception java.lang.Exception if the block number is invalid
   */
  public int getBlockAddress( int block ) throws Exception
  {

    buffer[offset+1] = (byte)mode ;
    // write nlink
    buffer[offset+2] = (byte)( nlink >>> 8 ) ;
    buffer[offset+3] = (byte)nlink ;
    // write uid
    buffer[offset+4] = (byte)( uid >>> 8 ) ;
    buffer[offset+5] = (byte)uid ;
    // write gid
    buffer[offset+6] = (byte)( gid >>> 8 ) ;
    buffer[offset+7] = (byte)gid ;
    // write the size info
    buffer[offset+8] = (byte)( size >>> 24 ) ;
    buffer[offset+8+1] = (byte)( size >>> 16 ) ;
    buffer[offset+8+2] = (byte)( size >>> 8 ) ;
    buffer[offset+8+3] = (byte)( size ) ;
    // write the directBlocks info 3 bytes at a time
    for( int i = 0 ; i < MAX_DIRECT_BLOCKS ; i ++ )
    {
      buffer[offset+12+3*i] = (byte)( directBlocks[i] >>> 16 ) ;
      buffer[offset+12+3*i+1] = (byte)( directBlocks[i] >>> 8 ) ;
      buffer[offset+12+3*i+2] = (byte)( directBlocks[i] ) ;
    }
    // leave room for indirectBlock, doubleIndirectBlock, tripleIndirectBlock
    // leave room for atime, mtime, ctime
  }
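The loop above stores each direct block address as three bytes, most significant byte first. A minimal round-trip sketch of that encoding follows; the class ThreeByte and its method names are hypothetical and exist only to illustrate the packing, and the decoding side is an assumption about how a matching reader would mask each byte.

```java
/** Hypothetical helper showing the 3-byte big-endian block-address encoding. */
final class ThreeByte {
    /** Stores the low 24 bits of value at buffer[offset..offset+2]. */
    static void put(byte[] buffer, int offset, int value) {
        buffer[offset]     = (byte) (value >>> 16);
        buffer[offset + 1] = (byte) (value >>> 8);
        buffer[offset + 2] = (byte) value;
    }

    /** Reads the 24-bit value back; "& 0xff" undoes Java's sign extension of bytes. */
    static int get(byte[] buffer, int offset) {
        return ((buffer[offset] & 0xff) << 16)
             | ((buffer[offset + 1] & 0xff) << 8)
             |  (buffer[offset + 2] & 0xff);
    }
}
```

Only the low 24 bits survive the round trip, so block addresses must stay below 2^24 for this layout to be lossless.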

  /**
   * Reads the contents of an index node from a byte array.
   * This is used to copy the bytes which correspond to the
   * disk image of the index node from a block buffer that
   * has been read from the file system.
   * @param buffer the buffer from which bytes should be read
   * @param offset the offset from the beginning of the buffer
   * at which bytes should be read
   */
  public void read( byte[] buffer , int offset )
  {
    int b3 ;
    int b2 ;
    int b1 ;
    int b0 ;
    // read the mode info
    b1 = buffer[offset] & 0xff ;
    b0 = buffer[offset+1] & 0xff ;

      s.append( ',' ) ;
      s.append( directBlocks[i] ) ;
    }
    s.append( '}' ) ;
    s.append( ']' ) ;
    return s.toString() ;
  }

  public void copy( IndexNode indexNode )
  {
    indexNode.mode = mode ;
    indexNode.nlink = nlink ;
    indexNode.uid = uid ;
    indexNode.gid = gid ;
    indexNode.size = size ;

filesys/Kernel.java

* $Id: Kernel.java

* 456789012345678901234567890123456789012345678901234567890123456789012

import java.util.StringTokenizer ;
import java.util.Properties ;
import java.io.FileInputStream ;
import java.io.IOException ;
import java.io.FileNotFoundException ;

/**
 * Simulates a unix-like file system. Provides basic directory
 * and file operations and implements them in terms of the underlying
 * disk block structures.
 */
public class Kernel
{
  /**
   * The name this program uses when displaying any error messages
   * it generates internally.
   */
  public static final String PROGRAM_NAME = "Kernel" ;

  /* Errors */

  /**
   * Not owner.
   */
  public static final int EPERM = 1 ;

  /**
   * No such file or directory.
   */
  public static final int ENOENT = 2 ;

  /**
   * Bad file number.
   */
  public static final int EBADF = 9 ;

  /**
   * Permission denied.
   */
  public static final int EACCES = 13 ;

  /**
   * File exists.
   */
  public static final int EEXIST = 17 ;

  /**
   * Cross-device link.
   */
  public static final int EXDEV = 18 ;

  /**
   * Not a directory.
   */
  public static final int ENOTDIR = 20 ;

  /**
   * Is a directory.
   */
  public static final int EISDIR = 21 ;

  /**
   * Invalid argument.
   */
  public static final int EINVAL = 22 ;

  /**
   * File table overflow.
   */
  public static final int ENFILE = 23 ;

  /**
   * Too many open files.
   */
  public static final int EMFILE = 24 ;

  /**
   * File too large.
   */
  public static final int EFBIG = 27 ;

  /**
   * No space left on device.
   */
  public static final int ENOSPC = 28 ;

  /**
   * Read-only file system.
   */
  public static final int EROFS = 30 ;

  /**
   * Too many links.
   */
  public static final int EMLINK = 31 ;

  /**
   * Number of error messages defined in sys_errlist.
   * Simulates unix system variable:
   *   int sys_nerr;
   */
  public static final int sys_nerr = 32 ;

  /**
   * The array of kernel error messages.
   * Simulates unix system variable:
   *   const char *sys_errlist[];
   */
  public static final String[] sys_errlist =
  {
    null ,
    "Not owner" ,
    "No such file or directory" ,
    null ,
    "Bad file number" ,
    null ,
    "Permission denied" ,
    null ,
    "File exists" ,
    "Cross-device link" ,
    null ,
    "Not a directory" ,
    "Is a directory" ,
    "Invalid argument" ,
    "File table overflow" ,
    "Too many open files" ,
    null ,
    "File too large" ,
    "No space left on device" ,
    null ,
    "Read-only file system" ,
    "Too many links"
  } ;

  /**
   * Prints a system error message. The actual text written
   * to stderr is the
   * given string, followed by a colon, a space, the message
   * text, and a newline. It is customary to give the name of
   * the program as the argument to perror.
   * @param s the program name
   */
  public static void perror( String s )
  {
    String message = null ;

  /**
   * Simulates the unix variable:
   *   extern int errno ;
   * @see getErrno
   */
  public static void setErrno( int newErrno )
  {
    if( process == null )
    {
      System.err.println( PROGRAM_NAME + ": no current process in setErrno()" ) ;
      System.exit( EXIT_FAILURE ) ;
    }
    process.errno = newErrno ;
  }

  /**
   * Get the value of errno for the current process.
   * Simulates the unix variable:
   *   extern int errno ;
   * @see setErrno
   */
  public static int getErrno()
  {
    if( process == null )
    {
      System.err.println( PROGRAM_NAME + ": no current process in getErrno()" ) ;
      System.exit( EXIT_FAILURE ) ;
    }
    return process.errno ;
  }

  /* Modes */

  /**
   * File type mask
   */
  public static final short S_IFMT = (short)0170000 ;

  /**
   * Regular file
   */
  public static final short S_IFREG = (short)0100000 ;

  /**
   * Multiplexed block special
   */
  public static final short S_IFMPB = 070000 ;

  /**
   * Block Special
   */
  public static final short S_IFBLK = 060000 ;

  /**
   * Directory
   */
  public static final short S_IFDIR = 040000 ;

  /**
   * Multiplexed character special
   */
  public static final short S_IFMPC = 030000 ;

  /**
   * Character special
   */
  public static final short S_IFCHR = 020000 ;

  /**
   * Set user id on execution
   */
  public static final short S_ISUID = 04000 ;

  /**
   * Set group id on execution
   */
  public static final short S_ISGID = 02000 ;

  /**
   * Save swapped text even after use
   */
  public static final short S_ISVTX = 01000 ;

  /**
   * User (file owner) has read, write and execute permission
   */
  public static final short S_IRWXU = 0700 ;

  /**
   * User has read permission
   */
  public static final short S_IRUSR = 0400 ;

  /**
   * User has read permission
   */
  public static final short S_IREAD = 0400 ;

  /**
   * User has write permission
   */
  public static final short S_IWUSR = 0200 ;

  /**
   * User has write permission
   */
  public static final short S_IWRITE = 0200 ;

  /**
   * User has execute permission
   */
  public static final short S_IXUSR = 0100 ;

  /**
   * User has execute permission
   */
  public static final short S_IEXEC = 0100 ;

  /**
   * Group has read, write and execute permission
   */
  public static final short S_IRWXG = 070 ;

  /**
   * Group has read permission
   */
  public static final short S_IRGRP = 040 ;

  /**
   * Group has write permission
   */
  public static final short S_IWGRP = 020 ;

  /**
   * Group has execute permission
   */
  public static final short S_IXGRP = 010 ;

  /**
   * Others have read, write and execute permission
   */
  public static final short S_IRWXO = 07 ;

  /**
   * Others have read permission
   */
  public static final short S_IROTH = 04 ;

  /**
   * Others have write permission
   */
  public static final short S_IWOTH = 02 ;

  /**
   * Others have execute permission
   */
  public static final short S_IXOTH = 01 ;
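The mode constants are meant to be combined with bitwise operators: the file type is extracted by masking with S_IFMT, and the permission bits by masking with the low nine bits. The sketch below copies three constant values from the listing into a hypothetical class ModeBits (the class and its helper methods are assumptions for illustration). It also shows that the masking still works when the short value is negative, as happens for S_IFREG, because both sides of the comparison are sign-extended the same way.

```java
/** Hypothetical helper showing how the mode bit masks are applied. */
final class ModeBits {
    // Values copied from the Kernel constants in the listing.
    static final short S_IFMT  = (short) 0170000;  // file type mask
    static final short S_IFREG = (short) 0100000;  // regular file
    static final short S_IFDIR = 040000;           // directory

    static boolean isDirectory(short mode) {
        return (mode & S_IFMT) == S_IFDIR;
    }

    static boolean isRegular(short mode) {
        return (mode & S_IFMT) == S_IFREG;
    }

    /** Returns only the rwxrwxrwx permission bits. */
    static int permissions(short mode) {
        return mode & 0777;
    }
}
```

For example, a directory created with mode 0755 would carry the value (short)(040000 | 0755); isDirectory returns true for it and permissions returns 0755.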

  /**
   * Closes the specified file descriptor.
   * Simulates the unix system call:
   *   int close(int fd);
   * @param fd the file descriptor of the file to close
   * @return Zero if the file is closed; -1 if the file descriptor
   * is invalid.
   */
  public static int close(int fd)
  {
    // check fd
    int status = check_fd(fd) ;

  /**
   * Creates a new file or prepares to rewrite an existing file.
   * If the file does not exist, it is given the mode specified.
   * If the file does exist, it is truncated to length zero.
   * The file is opened for writing and its file descriptor is
   * returned.
   * Simulates the unix system call:
   *   int creat(const char *pathname, mode_t mode);
   * @param pathname the name of the file or directory to create
   * @param mode the file or directory protection mode for the new file
   * @return the file descriptor (a non-negative integer); -1 if
   * a needed directory is not searchable, if the file does not
   * exist and the directory in which it is to be created is not
   * writable, if the file does exist and is unwritable, if the
   * file is a directory, or if there are already too many open
   * files.
   * @exception java.lang.Exception if any underlying action causes
   * an exception to be thrown
   */
  public static int creat( String pathname , short mode ) throws Exception
  {
    // get the full path
    String fullPath = getFullPath( pathname ) ;
    StringBuffer dirname = new StringBuffer( "/" ) ;
    FileSystem fileSystem = openFileSystems[ROOT_FILE_SYSTEM] ;
    IndexNode currIndexNode = getRootIndexNode() ;
    IndexNode prevIndexNode = null ;
    short indexNodeNumber = FileSystem.ROOT_INDEX_NODE_NUMBER ;
    StringTokenizer st = new StringTokenizer( fullPath , "/" ) ;
    String name = "." ; // start at root node
    while( st.hasMoreTokens() )
    {
      name = st.nextToken() ;
      if ( ! name.equals("") )
      {
        // check to see if the current node is a directory
        if( ( currIndexNode.getMode() & S_IFMT ) != S_IFDIR )
        {
          // return (ENOTDIR) if a needed directory is not a directory
          process.errno = ENOTDIR ;
          return -1 ;
        }
        // check to see if it is readable by the user
        // ??? tbd
        // return (EACCES) if a needed directory is not readable
        if( st.hasMoreTokens() )
        {
          dirname.append( name ) ;
          dirname.append( '/' ) ;
        }
        // get the next inode corresponding to the token
        prevIndexNode = currIndexNode ;
        currIndexNode = new IndexNode() ;
        indexNodeNumber = findNextIndexNode(
          fileSystem , prevIndexNode , name , currIndexNode ) ;
      }
    }
    // ??? we need to set some fields in the file descriptor
    int flags = O_WRONLY ; // ???
    FileDescriptor fileDescriptor = null ;

    {
      int seek_status = lseek( dir , - DirectoryEntry.DIRECTORY_ENTRY_SIZE , 1 ) ;

    {
      DirectoryEntry nextDirectoryEntry = new DirectoryEntry() ;
      // read next item
      status = readdir( dir , nextDirectoryEntry ) ;
      if( status > 0 )
      {
        // in its place
        int seek_status = lseek( dir , - DirectoryEntry.DIRECTORY_ENTRY_SIZE , 1 ) ;

    // if it's a directory, we can't truncate it
    if( ( currIndexNode.getMode() & S_IFMT ) == S_IFDIR )
    {
      // return (EISDIR) if the file is a directory
      process.errno = EISDIR ;
      return -1 ;
    }
    // check to see if the file is writeable by the user
    // ??? tbd
    // return (EACCES) if the file does exist and is unwritable

    // free any blocks currently allocated to the file
    int blockSize = fileSystem.getBlockSize() ;
    int blocks = ( currIndexNode.getSize() + blockSize - 1 ) /
      blockSize ;

  /**
   * Simulates the unix system call:
   *   exit(int status);
   * Note: If this is the last process to terminate, this method
   * calls finalize().
   * @param status the exit status
   * @exception java.lang.Exception if any underlying
   * Exception is thrown
   */
  public static void exit( int status ) throws Exception
  {

  /**
   * Simulates the unix system call:
   *   lseek( int filedes , int offset , int whence );
   * @param fd the file descriptor
   * @param offset the offset
   * @param whence 0 = from beginning of file; 1 = from
   * current position ; 2 = from end of file
   */
  public static int lseek( int fd , int offset , int whence )
  {
    // check fd
    int status = check_fd( fd ) ;

  /**
   * The file is positioned at the beginning (byte 0).
   * The returned file descriptor must be used for subsequent
   * calls for other input and output functions on the file.
   * Simulates the unix system call:
   *   int open(const char *pathname, int flags );
   */
  public static int open( String pathname , int flags ) throws Exception
  {
    // get the full path name
    String fullPath = getFullPath( pathname ) ;
    IndexNode indexNode = new IndexNode() ;
    short indexNodeNumber = findIndexNode( fullPath , indexNode ) ;

  /**
   * This is a convenience method for the simulator kernel.
   * @param fileDescriptor the file descriptor
   * @return the file descriptor index in the process open file
   * list assigned to this open file
   */
  private static int open( FileDescriptor fileDescriptor )
  {
    // scan the kernel open file list for a slot
    // and add our new file descriptor
    int kfd = -1 ;

  /**
   * Simulates the unix system call:
   *   int read(int fd, void *buf, size_t count);
   */
  public static int read( int fd , byte[] buf , int count ) throws Exception
  {
    // check fd
    int status = check_fd_for_read( fd ) ;

        break ;
      // if this is the first time through the loop,
      // or if we're at the beginning of a block, load the data block
      if( ( i == 0 ) || ( ( offset % blockSize ) == 0 ) )
      {
        status = file.readBlock( (short)( offset / blockSize ) ) ;

  /**
   * Simulates the unix system call:
   *   int readdir(unsigned int fd, struct dirent *dirp ) ;
   */
  public static int readdir( int fd , DirectoryEntry dirp ) throws Exception
  {
    // check fd
    int status = check_fd_for_read( fd ) ;

      return 0 ;
    // read a block, if needed
    status = file.readBlock( (short)( file.getOffset() / file.getBlockSize() ) ) ;

  /**
   * Simulates the unix system call:
   *   int fstat(int filedes, struct stat *buf);
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int fstat( int fd , Stat buf ) throws Exception
  {
    // check fd
    int status = check_fd( fd ) ;

  /**
   * Simulates the unix system call:
   *   int stat(const char *name, struct stat *buf);
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int stat( String name , Stat buf ) throws Exception
  {
    // a buffer for reading directory entries
    DirectoryEntry directoryEntry = new DirectoryEntry() ;
    // get the full path
    String path = getFullPath( name ) ;
    // find the index node
    IndexNode indexNode = new IndexNode() ;

  /**
   * Simulates unix system call:
   *   int sync(void);
   */
  public static void sync()
  {
    // write out superblock if updated
    // write out free list blocks if updated
    // write out inode blocks if updated
    // write out data blocks if updated

    // at present, all changes to inodes, data blocks,
    // and free list blocks
    // are written as they go, so this method does nothing.
  }

  /**
   * Write bytes to a file.
   * Simulates the unix system call:
   *   int write(int fd, const void *buf, size_t count);
   * @exception java.lang.Exception if any underlying action causes
   * Exception to be thrown
   */
  public static int write( int fd , byte[] buf , int count ) throws Exception
  {
    // check fd
    int status = check_fd_for_write( fd ) ;

      {
        file.setSize( offset ) ;
        size = offset ;
      }
      writeCount ++ ;
    }
    // write the last block if we wrote anything to it
    if( ( offset % blockSize ) > 0 )
    {
      status = file.writeBlock( (short)( ( offset - 1 ) / blockSize ) ) ;
      if( status < 0 )
        return status ;
    }
    // update the file size if it grew
    if( offset > size )
      file.setSize( offset ) ;
    // update the offset
    file.setOffset( offset ) ;
    // return the count of bytes written
    return writeCount ;
  }

  /**
   * Writes a directory entry from a file descriptor for an
   * open directory.
   * Simulates the unix system call:
   *   int readdir(unsigned int fd, struct dirent *dirp ) ;
   */
  public static int writedir( int fd , DirectoryEntry dirp ) throws Exception
  {
    // check fd
    int status = check_fd_for_write( fd ) ;

    file.setSize( file.getOffset() ) ;
    // return the size of a DirectoryEntry
    return DirectoryEntry.DIRECTORY_ENTRY_SIZE ;
  }

  private static ProcessContext process = null ;

  /**
   * The number of processes.
   */
  private static int processCount = 0 ;

  private static int MAX_OPEN_FILES = 0 ;

  private static FileDescriptor[] openFiles = null ;

  // ??? should be private?
  public static int MAX_OPEN_FILE_SYSTEMS = 1 ;

  // ??? should be private?
  public static FileSystem[] openFileSystems = new FileSystem[MAX_OPEN_FILE_SYSTEMS] ;

  // ??? should be private?
  public static short ROOT_FILE_SYSTEM = 0 ;

  public static void initialize()
  {
    if ( propertyFileName == null )
      propertyFileName = "filesys.conf" ;
    Properties properties = new Properties() ;
    try
    {
      FileInputStream in = new FileInputStream( propertyFileName ) ;
      properties.load( in ) ;
      in.close() ;
    }
    catch( FileNotFoundException e )
    {
      System.err.println( PROGRAM_NAME + ": error opening properties file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }
    catch( IOException e )
    {
      System.err.println( PROGRAM_NAME + ": error reading properties file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    // get the root file system properties
    String rootFileSystemFilename = properties.getProperty(
      "filesystem.root.filename" , "filesys.dat" ) ;
    String rootFileSystemMode = properties.getProperty(
      "filesystem.root.mode" , "rw" ) ;

    // get the current process properties
    short uid = 1 ;
    try
    {
      uid = Short.parseShort( properties.getProperty( "process.uid" , "1" ) ) ;
    }
    catch ( NumberFormatException e )
    {
      System.err.println( PROGRAM_NAME +
        ": invalid number for property process.uid in configuration file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    short gid = 1 ;
    try
    {
      gid = Short.parseShort( properties.getProperty( "process.gid" , "1" ) ) ;
    }
    catch ( NumberFormatException e )
    {
      System.err.println( PROGRAM_NAME +
        ": invalid number for property process.gid in configuration file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    short umask = 0002 ;
    try
    {
      umask = Short.parseShort( properties.getProperty( "process.umask" , "002" ) , 8 ) ;
    }
    catch ( NumberFormatException e )
    {
      System.err.println( PROGRAM_NAME +
        ": invalid number for property process.umask in configuration file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    String dir = "/root" ;
    dir = properties.getProperty( "process.dir" , "/root" ) ;

    try
    {
      MAX_OPEN_FILES = Integer.parseInt( properties.getProperty(
        "kernel.max_open_files" , "20" ) ) ;
    }
    catch( NumberFormatException e )
    {
      System.err.println( PROGRAM_NAME +
        ": invalid number for property kernel.max_open_files in configuration file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    try
    {
      ProcessContext.MAX_OPEN_FILES = Integer.parseInt(
        properties.getProperty( "process.max_open_files" , "10" ) ) ;
    }
    catch( NumberFormatException e )
    {
      System.err.println( PROGRAM_NAME +
        ": invalid number for property process.max_open_files in configuration file" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

    // create open file array
    openFiles = new FileDescriptor[MAX_OPEN_FILES] ;

    // create the first process
    process = new ProcessContext( uid , gid , dir , umask ) ;
    processCount ++ ;

    // open the root file system
    try
    {
      openFileSystems[ROOT_FILE_SYSTEM] = new FileSystem(
        rootFileSystemFilename , rootFileSystemMode ) ;
    }
    catch( IOException e )
    {
      System.err.println( PROGRAM_NAME + ": unable to open root file system" ) ;
      System.exit( EXIT_FAILURE ) ;
    }

  /**
   * Failure exit status.
   */
  private static int EXIT_FAILURE = 1 ;

  /**
   * Success exit status.
   */
  private static int EXIT_SUCCESS = 0 ;

  public static void finalize( int status ) throws Exception
  {
    // exit() any remaining processes
    if( process != null )
      exit( 0 ) ;
    // flush file system blocks
    sync() ;
    // close the root file system
    openFileSystems[ROOT_FILE_SYSTEM].close() ;
    // terminate the program
    System.exit( status ) ;
  }

  /* Some internal methods. */

  /**
   * Check to see if the integer given is a valid file descriptor
   * index for the current process. Sets errno to EBADF if invalid.
   */
  private static int check_fd( int fd )
  {
    // look for the file descriptor in the open file list
    if ( fd < 0 || fd >= process.openFiles.length || process.openFiles[fd] == null )
    {
      // return (EBADF) if file descriptor is invalid
      process.errno = EBADF ;
      return -1 ;
    }
    return 0 ;
  }

  private static int check_fd_for_read( int fd )
  {
    int status = check_fd( fd ) ;

  private static int check_fd_for_write( int fd )
  {
    int status = check_fd( fd ) ;

filesys/ls.java

/**
 * A simple directory listing program for a simulated file system.
 * Usage:
 *   java ls path-name ...
 */
public class ls
{
  /**
   * The name of this program.
   * This is the program name that is used
   * when displaying error messages.
   */
  public static String PROGRAM_NAME = "ls" ;

  /**
   * Lists information about named files or directories.
   * @exception java.lang.Exception if an exception is thrown
   * by an underlying operation
   */
  public static void main( String[] args ) throws Exception
  {
    // initialize the file system simulator kernel
    Kernel.initialize() ;
    // for each path-name given

/**
 * Usage:
 *   java mkdir directory-name ...
 */
public class mkdir
{
  /**
   * The name of this program.
   * This is the program name that is used
   * when displaying error messages.
   */
  public static final String PROGRAM_NAME = "mkdir" ;

  /**
   * Creates the directories given as command line arguments.
   * @exception java.lang.Exception if an exception is thrown
   * by an underlying operation
   */
  public static void main( String[] args ) throws Exception
  {
    // initialize the file system simulator kernel
    Kernel.initialize() ;
    // print a helpful message if no command line arguments are given

      System.exit( 1 ) ;
    }

    String filename = argv[0] ;
    short block_size = Short.parseShort( argv[1] ) ;
    int blocks = Integer.parseInt( argv[2] ) ;
    int block_total = 0 ;

    /*
      blocks =
        super_blocks +
        free_list_blocks +
        inode_blocks +
        data_blocks

      We need one block for the superblock.

      super_blocks = 1

      We need one bit in the free list map for each data block.

      free_list_blocks =
        ( data_blocks + block_size * 8 - 1 ) /
        ( block_size * 8 )

      ??? Is this the correct number of inodes?
      At worst, there will be only directory entries and empty files.
      In other words, we might need as many inodes as we have blocks.

      inode_blocks =
        ( data_blocks + block_size / inode_size - 1 ) /
        ( block_size / inode_size )

      Then:

      blocks =
        super_blocks +
        ( data_blocks + block_size * 8 - 1 ) /
        ( block_size * 8 ) +
        ( data_blocks + block_size / inode_size - 1 ) /
        ( block_size / inode_size ) +
        data_blocks

      We then seek the maximum number of data blocks where the total number
      of blocks is less than or equal to the number of blocks available.
      We use a binary searching technique in the following algorithm.
    */

    int inode_size = IndexNode.INDEX_NODE_SIZE ;
    int super_blocks = 1 ;
    int free_list_blocks = 0 ;
    int inode_blocks = 0 ;
    int data_blocks = 0 ;
    int lo = 0 ;
    int hi = blocks ;

        hi = data_blocks - 1 ;
      else
        break ;
    }

    // if the last block causes free_list_blocks or inode_blocks to
    // cross a block boundary, we "give" the extra space to the free
    // list and/or inodes and use whatever remains for the data blocks
    if( block_total > blocks )
    {
      // System.out.println( "adjusting data blocks..." ) ;
      // System.out.println() ;
      data_blocks -- ;
    }

    // calculate inode and free list blocks based on the final
    // count of data blocks
    free_list_blocks = ( data_blocks + block_size * 8 - 1 ) / ( block_size * 8 ) ;
    inode_blocks = blocks - super_blocks - free_list_blocks - data_blocks ;
    block_total = super_blocks + free_list_blocks + inode_blocks + data_blocks ;
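The layout formulas in the comment above can be turned into a small, self-contained calculation. The sketch below is a simplified variant, not the listing's own loop (which is only partly visible here): it performs a plain binary search for the largest data_blocks whose total fits, and it does not hand leftover blocks to the inode area. The class LayoutPlanner and its method names are hypothetical.

```java
/** Hypothetical planner applying the block-count formulas from the mkfs comment. */
final class LayoutPlanner {
    /** Total blocks needed for a file system with the given number of data blocks. */
    static long totalBlocks(long dataBlocks, int blockSize, int inodeSize) {
        long bitsPerBlock = (long) blockSize * 8;       // one free-list bit per data block
        long inodesPerBlock = blockSize / inodeSize;    // one inode per data block (worst case)
        long freeListBlocks = (dataBlocks + bitsPerBlock - 1) / bitsPerBlock;
        long inodeBlocks = (dataBlocks + inodesPerBlock - 1) / inodesPerBlock;
        return 1 + freeListBlocks + inodeBlocks + dataBlocks;   // 1 = superblock
    }

    /** Largest data block count whose total does not exceed the blocks available. */
    static long maxDataBlocks(long blocks, int blockSize, int inodeSize) {
        long lo = 0;
        long hi = blocks;
        while (lo < hi) {
            long mid = lo + (hi - lo + 1) / 2;          // round up so the loop terminates
            if (totalBlocks(mid, blockSize, inodeSize) <= blocks) {
                lo = mid;                               // mid fits, try larger
            } else {
                hi = mid - 1;                           // mid is too large
            }
        }
        return lo;
    }
}
```

The search is valid because totalBlocks never decreases as dataBlocks grows. With 100 blocks of 256 bytes and 64-byte inodes, 78 data blocks need 1 + 1 + 20 + 78 = 100 blocks, while 79 would need 101.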

filesys/ProcessContext.java

/**

* A process context. This contains all information needed

* by the file system which is specific to a process.

public

class

ProcessContext

{

/**

* Number of last error.

* Simulates the unix system variable:


   *   extern int errno;

   *

public

int
errno
=

0

;

  /**
   * The uid for the process.
   */
  private short uid = 1 ;

  /**
   * The gid for the process.
   */
  private short gid = 1 ;

  /**
   * The working directory for the process.
   */
  private String dir = "/root" ;

  /**
   * The umask for the process.
   */
  private short umask = 0000 ;

  /**
   * The maximum number of files a process may have open.
   */
  public static int MAX_OPEN_FILES = 0 ;

  /**
   * The array of file descriptors for open files.
   * The integer file descriptors for kernel method calls
   * are indexes into this array.
   */
  public FileDescriptor[] openFiles = new FileDescriptor[MAX_OPEN_FILES] ;

  /**
   * Construct a process context. By default, uid=1, gid=1, dir="/root",
   * and umask=0000.
   */
  public ProcessContext()
  {
    super() ;
  }

  /**
   * Construct a process context and specify uid, gid, dir, and umask.
   */
  public ProcessContext( short newUid , short newGid , String newDir , short newUmask )
  {
    super() ;
    uid = newUid ;
    gid = newGid ;
    dir = newDir ;
    umask = newUmask ;
  }

  /**
   * Set the process uid.
   */
  public void setUid( short newUid )
  {
    uid = newUid ;
  }

  /**
   * Get the process uid.
   */
  public short getUid()
  {
    return uid ;
  }

  /**
   * Set the process gid.
   */
  public void setGid( short newGid )
  {
    gid = newGid ;
  }

  /**
   * Get the process gid.
   */
  public short getGid()
  {
    return gid ;
  }

  /**
   * Set the process working directory.
   */
  public void setDir( String newDir )
  {
    dir = newDir ;
  }

  /**
   * Get the process working directory.
   */
  public String getDir()
  {
    return dir ;
  }

  /**
   * Set the process umask.
   */
  public void setUmask( short newUmask )
  {
    umask = newUmask ;
  }

  /**
   * Get the process umask.
   */
  public short getUmask()
  {
    return umask ;
  }

  // ??? toString()

}
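The class comment says that integer file descriptors are indexes into the `openFiles` array. The source shown here does not include the code that allocates those indexes, so the following is only a sketch of one common convention (reuse the lowest free slot, as Unix does). The class `DescriptorTableSketch` and its methods are hypothetical, and plain `Object` stands in for `FileDescriptor` to keep the example self-contained.

```java
final class DescriptorTableSketch {
    // Slot i holds the open file for descriptor i, or null if descriptor i is free.
    private final Object[] openFiles;

    DescriptorTableSketch(int maxOpenFiles) {
        openFiles = new Object[maxOpenFiles];
    }

    /** Stores the file in the lowest free slot; returns its index, or -1 if the table is full. */
    int open(Object file) {
        for (int fd = 0; fd < openFiles.length; fd++) {
            if (openFiles[fd] == null) {
                openFiles[fd] = file;
                return fd;
            }
        }
        return -1;
    }

    /** Frees the slot; returns 0 on success, -1 if fd is out of range or not open. */
    int close(int fd) {
        if (fd < 0 || fd >= openFiles.length || openFiles[fd] == null) {
            return -1;
        }
        openFiles[fd] = null;
        return 0;
    }
}
```

Note that in `ProcessContext` the array is sized from the static `MAX_OPEN_FILES` when each instance is constructed, and that field defaults to 0, so it presumably has to be set before any context is created; otherwise the table has no slots.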


filesys/Stat.java

/**
 * This simulates the unix struct "stat".
 */
public class Stat
{
  public short st_dev = 0 ;
  public short st_ino = 0 ;
  public short st_mode = 0 ;
  public short st_nlink = 0 ;
  public short st_uid = 0 ;
  public short st_gid = 0 ;
  public short st_rdev = 0 ;
  public int st_size = 0 ;
  public int st_atime = 0 ;
  public int st_mtime = 0 ;
  public int st_ctime = 0 ;

  public void setDev( short newDev )
  {
    st_dev = newDev ;
  }

  public short getDev()
  {
    return st_dev ;
  }

  public void setIno( short newIno )
  {
    st_ino = newIno ;
  }

  public short getIno()
  {
    return st_ino ;
  }

  public void setMode( short newMode )
  {
    st_mode = newMode ;
  }

  public short getMode()
  {
    return st_mode ;
  }

  public void setNlink( short newNlink )
  {
    st_nlink = newNlink ;
  }

  public short getNlink()
  {
    return st_nlink ;
  }

  public void setUid( short newUid )
  {
    st_uid = newUid ;
  }

  public short getUid()
  {
    return st_uid ;
  }

  public void setGid( short newGid )
  {
    st_gid = newGid ;
  }

  public short getGid()
  {
    return st_gid ;
  }

  public void setRdev( short newRdev )
  {
    st_rdev = newRdev ;
  }

  public short getRdev()
  {
    return st_rdev ;
  }

  public void setSize( int newSize )
  {
    st_size = newSize ;
  }

  public int getSize()
  {
    return st_size ;
  }

  public void setAtime( int newAtime )
  {
    st_atime = newAtime ;
  }

  public int getAtime()
  {
    return st_atime ;
  }

  public void setMtime( int newMtime )
  {
    st_mtime = newMtime ;
  }

  public int getMtime()
  {
    return st_mtime ;
  }

  public void setCtime( int newCtime )
  {
    st_ctime = newCtime ;
  }

  public int getCtime()
  {
    return st_ctime ;
  }

  public void copyIndexNode( IndexNode indexNode )
  {
    st_mode = indexNode.getMode() ;
    st_nlink = indexNode.getNlink() ;
    st_uid = indexNode.getUid() ;
    st_gid = indexNode.getGid() ;
    st_size = indexNode.getSize() ;
    st_atime = indexNode.getAtime() ;
    st_mtime = indexNode.getMtime() ;
    st_ctime = indexNode.getCtime() ;
  }

}


filesys/SuperBlock.java

import java.io.RandomAccessFile ;
import java.io.IOException ;
import java.util.* ;

public class SuperBlock
{
  /**
   * Size of each block in the file system.
   */
  private short blockSize ;

  /**
   * Total number of blocks in the file system.
   */
  private int blocks ;

  /**
   * Offset in blocks of the free list block region from the beginning
   * of the file system.
   */
  private int freeListBlockOffset ;

  /**
   * Offset in blocks of the inode block region from the beginning
   * of the file system.
   */
  private int inodeBlockOffset ;

  /**
   * Offset in blocks of the data block region from the beginning
   * of the file system.
   */
  private int dataBlockOffset ;

  /**
   * Construct a SuperBlock.
   */
  public SuperBlock()
  {
    super() ;
  }

  public void setBlockSize( short newBlockSize )
  {
    blockSize = newBlockSize ;
  }

  public short getBlockSize()
  {
    return blockSize ;
  }

  public void setBlocks( int newBlocks )
  {
    blocks = newBlocks ;
  }

  public int getBlocks()
  {
    return blocks ;
  }

  /**
   * Set the freeListBlockOffset (in blocks)
   * @param newFreeListBlockOffset the new offset in blocks
   */
  public void setFreeListBlockOffset( int newFreeListBlockOffset )
  {
    freeListBlockOffset = newFreeListBlockOffset ;
  }

  /**
   * Get the free list block offset
   * @return the free list block offset
   */
  public int getFreeListBlockOffset()
  {
    return freeListBlockOffset ;
  }

  /**
   * Set the inodeBlockOffset (in blocks)
   * @param newInodeBlockOffset the new offset in blocks
   */
  public void setInodeBlockOffset( int newInodeBlockOffset )
  {
    inodeBlockOffset = newInodeBlockOffset ;
  }

  /**
   * Get the inode block offset (in blocks)
   * @return inode block offset in blocks
   */
  public int getInodeBlockOffset()
  {
    return inodeBlockOffset ;
  }

  /**
   * Set the dataBlockOffset (in blocks)
   * @param newDataBlockOffset the new offset in blocks
   */
  public void setDataBlockOffset( int newDataBlockOffset )
  {
    dataBlockOffset = newDataBlockOffset ;
  }

  /**
   * Get the dataBlockOffset (in blocks)
   * @return the offset in blocks to the data block region
   */
  public int getDataBlockOffset()
  {
    return dataBlockOffset ;
  }

  /**
   * Writes this SuperBlock at the current position of the specified file.
   */
  public void write( RandomAccessFile file ) throws IOException
  {
    file.writeShort( blockSize ) ;
    file.writeInt( blocks ) ;
    file.writeInt( freeListBlockOffset ) ;
    file.writeInt( inodeBlockOffset ) ;
    file.writeInt( dataBlockOffset ) ;
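The rest of `write` is missing from this listing, and no matching read method is shown. As an illustration of the on-disk order implied by the five calls above (one short followed by four ints, 18 bytes in total), the sketch below writes the same sequence to a temporary file and reads it back. The class `SuperBlockRoundTrip`, its method, and the sample values are assumptions made for the example, not part of the original source.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

final class SuperBlockRoundTrip {
    /**
     * Writes the five fields in the order used by write(), then reads them back.
     * Returns the five values read, followed by the number of bytes written.
     */
    static int[] roundTrip(short blockSize, int blocks, int freeListOffset,
                           int inodeOffset, int dataOffset) {
        try {
            File tmp = File.createTempFile("superblock", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                file.writeShort(blockSize);       // 2 bytes
                file.writeInt(blocks);            // 4 bytes each from here on
                file.writeInt(freeListOffset);
                file.writeInt(inodeOffset);
                file.writeInt(dataOffset);
                int bytesWritten = (int) file.getFilePointer();

                file.seek(0);                     // read back in the same order
                return new int[] {
                    file.readShort(), file.readInt(), file.readInt(),
                    file.readInt(), file.readInt(), bytesWritten
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = roundTrip((short) 256, 100, 1, 2, 22);
        System.out.println(r[0] + " " + r[1] + " " + r[5]); // prints 256 100 18
    }
}
```

Reading must use the same types in the same order as writing, because `RandomAccessFile` stores raw big-endian values with no field markers.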

* Usage:


 *   java tee output-file

 *