CS 45600: Detailed information all students must know
Prerequisites
You must grasp basic algorithms, data structures, and good programming practice.
You should have substantial programming experience. Those without
such experience will have difficulty keeping up with the homework.
Proficiency in C is needed for a few homework assignments; if you have a
strong background in C++, some details will be different, but your background
should be sufficient. You must have some basic mathematics (e.g., simple
propositional calculus, elementary set theory) and must be able to prove a
theorem, including proofs by induction.
You need some Unix experience. You must understand the basics of files,
directories, creating and editing files, printing, compiling and loading
programs, and using make. You will be much, much happier if you also
can write a simple shell script (sh) and use Awk and grep
effectively.
Kernighan and Pike cover
these topics at the appropriate level.
Your programming experience should include work with dynamically allocated
data structures and with recursive functions. You should be very
comfortable writing recursive functions in the language of your
choice, as well as proving that such functions terminate.
You should have implemented some of the basic data structures and algorithms
used in computer science, like stacks, queues, lists, tables, search trees,
standard sorts (quick, insertion, merge, heap), topological sort, and graph
algorithms. These topics are well covered in other courses. Prior exposure
to exhaustive search (backtracking) will also be helpful.
Some students spend many, many hours thrashing out homework assignments. This
course uses unusual programming paradigms, and the techniques you are
accustomed to may not be much help. Although this material is not a formal
prerequisite for this course, you will be happier if you have a nodding
acquaintance with formal
methods, including the following intellectual tools:
- Loop invariants and termination conditions as they
apply to imperative programs.
- Contracts for functions,
including preconditions and postconditions.
- Termination conditions for recursive functions. When
writing recursive functions, try to develop your understanding of the deep
connections between recursion and loops; the ideas of invariants and
termination conditions apply in both cases.
- Representation invariants and abstraction functions for
abstract data types.
You can brush up on this material by looking at the
article by Bentley on the reading
list. Chapter 4 of Liskov and
Guttag has a nice tutorial on reasoning about data, which you will find
helpful in several assignments.
You should be able to write an informal mathematical proof.
For example, you should be able to prove that a sort routine returns the same
set of elements that it was passed. You should be comfortable using basic
mathematical formulas with "forall" and "exists" quantifiers, i.e., the
propositional and predicate calculi, although it will not be necessary to
write formal proofs using these calculi. You should know basic set theory,
e.g., the mathematical definition of functions as sets of ordered pairs. You
should be comfortable reading and writing formal mathematical
notation, or at least not run screaming from the room.
The most important prerequisites for this course cannot be taught. To do well
in this or any other course that involves programming, develop these habits:
- Think carefully about a problem before you begin to write code.
- When having difficulty writing code, stand up,
walk away from your computer, and think about the
difficulty. Enlist the black/white-board as your ally.
- Never be satisfied with a “working program,” but strive for
the simplest, clearest, most elegant implementation.
If you have these habits, the other prerequisites are almost
irrelevant. If you don't, you can expect to have trouble no matter
what your background.
Interactions are Expected
Engineering is not a solitary profession. To maximize your chances of success
in 456 and beyond, the class includes some interactive experiences.
-
Part of your job as a student is to get to know some of the faculty.
To help you with this part of your education, up to five minutes of
office-hour visits count as part of your course grade. Each minute you
spend in conversation during office hours will earn one percent of your
overall course grade, up to a possible total of five percent.
To earn your five percent, you must come to office hours
by the end of February. While you may find it helpful to talk
about homework, class, engineering, or Purdue overall, any mutually
agreeable topic of conversation is acceptable. Ask about the Crows!
-
Much of the programming in 456 is about your own individual
understanding of new language features and new ideas, and you will tackle
this sort of programming on your own. But sometimes you will have the
opportunity to build something more substantial. And in the real world,
substantial artifacts are seldom built by individuals working alone.
CS 456 will therefore provide you with some opportunity to practice
pair
programming. You will have the opportunity to practice pair
programming on selected problems throughout the term.
No single pair may work together on more than three
assignments. If you need help finding a partner, advertise on
Piazza.
Spontaneous interactions may be welcome or unwelcome depending on the
interaction:
- A particularly useful form of interaction is the question asked in
class. Questions are always welcome; if you have a question, chances are
other people in class have a similar question. Ask it!
- One question that is always legitimate is "why are we doing this?"
Class participation
A portion of your grade is based on your participation in class. This phrase
encompasses a variety of activities, all with the same purpose: to earn high
grades for class participation, you must show that you are actively
engaged in managing your own learning, developing new skills, and developing
new ways of programming and problem-solving. You can be engaged in a
variety of ways:
- Participating in the on-line course evaluations (both
mid-term and end-of-term)
- Asking appropriate questions in class
- Answering questions when called on in class
- Asking appropriate questions on Piazza
- Answering questions well on Piazza
- Organizing study groups
- Interacting professionally with programming partners and course staff
- Working out ideas with course staff
- Helping other students understand the material (but not doing the work
for them)
Nobody has to do all of these things; you can earn top grades for
class participation by doing just a few things well. In particular, nobody is
required to speak in class, but everybody should be prepared to answer
questions if called upon.
What questions are appropriate? Any question about programming languages.
However, it may not be appropriate to insist that every question be tracked to
its lair and dispatched.
Professional interactions with other students and with course staff are the
same as those which are expected in any workplace. It is also professional
for you to recognize that a member of the course staff may be present but not
actually available to talk about 456.
Homework is critical
In this class, you will learn most of the material on your own as you
complete the homework assignments. The importance of homework is
reflected in the weight it is assigned.
Most homework for this course involves short programming assignments. Many of
them will be based on the text
by Ramsey. There will be also be
some larger programming assignments. There will be some theory homework,
involving more proving and less programming.
As in most classes (exceptions, you know who you are), it helps to start the
homework early. But in 456, starting early will produce
unusually good benefits. Many students find if they start early,
even if they don't appear to make much progress, a solution will “come
to them” while they are doing something else.
Another reason to start early is that if you get stuck, early help is
a lot better than late help.
If you complete and understand all the homework assignments, you are almost
certain to do well on the exams and earn a high grade. If you miss
assignments or don't really understand the homework, it will be difficult for
you to earn a satisfactory grade. Extensive details
about grades are available online.
Format of homework
Your written work must carry your name, and it must be neat and well
organized. If you can, use a computer to prepare your written work.
Otherwise, write it by hand and scan it into PDF. Any work that cannot easily
be read will receive No Credit. Clear English expression is
required; grammar and spelling count. The same requirements
apply to exams.
Every assignment should include a README file that describes the
work. This description must
- Identify you by name
- Identify what aspects of the work have been correctly implemented and
what have not.
- Identify anyone with whom you have collaborated
or discussed the assignment.
- Say approximately how many hours you have spent
completing the assignment.
Extra Credit
Many homework assignments will offer opportunities to earn extra credit.
Extra credit applies to adjust final letter grades. For
example, if your grade average falls in the borderline between A- and B+, the
instructor will assign you the higher grade at my discretion if you have done
extra-credit work. Extra credit may also be mentioned in letters of
recommendation.
Extra credit is just that: extra. It is possible to earn an A
without doing any extra credit.
Readability of programming assignments
A solution to a problem is of little value if that solution cannot be
understood and modified by others. For that reason, your homework will be
graded on your explanation of what you are doing as well as your
results.
- For homeworks that involve small programming problems, most explanations
will take the form of documentation for the code you write. Each
assignment explains what documentation is expected.
- For homeworks that involve a large programming problem or that have a
substantial design component, an explanation should appear in a
README file that accompanies the assignment.
Appropriate explanations vary with the size of the problem.
- For a trivial solution of a few lines of code, it's often best to have
no explanation at all. Ideally the names of your functions and their
parameters will be all the explanation anyone needs to understand the
code.
- For a solution of a dozen lines of code, a sentence or two is usually
sufficient.
For these kinds of small problems, the best method of explanation is
comments in with the source code itself.
- For a solution of a hundred lines or more, expect several paragraphs of
explanation. Here the explanation needs to cover not so much the code
itself, but the organization of the code and the plan that produced it.
- For larger programs, especially programs that are divided into multiple
files, a couple of pages on planning, organization, and so on are
appropriate.
For these larger problems, you must describe your thinking at least
as well as your code. These kinds of long explanation should be in the README
file, not in the code itself. It is as bad to write too much
explanation for a simple solution as to write too little explanation for a
complex solution.
Good programming style and documentation are essential ingredients in a
readable assignment. Good style includes the use of functions, modules, and
abstract data types where appropriate. Select algorithms appropriate to the
task at hand, and make your code structured and easy to read. Good use and
placement of documentation is vital. Lots of comment text is usually
a sign of poor documentation. Documentation should focus on
high-level issues such as design and organization, data
representation, and data invariants. As noted above, even programs
that run perfectly will earn low grades if they are badly documented.
Documentation is too deep a topic in its own right to address comprehensively
here, but the following are a few suggestions. Good large-scale documentation
should answer such questions as:
- What algorithms are used, if any?
- Why is the implementation correct?
- Why does it terminate?
- What are the important invariants,
preconditions, and postconditions?
- What are the major data structures and their representations?
Large-scale documentation is usually not a good place to list a lot of
information about the names of functions and their arguments and other
low-level details that are most easily understood in the context of reading
the code.
Small-scale documentation (comments in the code) should say
- What is the contract of each function? (The contract of a simple
function should be evident merely from the name of the function and from
the names and types of its arguments and result. The contract of a more
complex function may require a comment.)
There's more about contracts below.
- What are the arguments of the most significant functions? For example,
is a list of edges the list of all edges in the graph or the list of
edges not yet visited?
- If any code is tricky (e.g., you had trouble getting it right the first
time), what are the local invariants? What are the pre- and
post-conditions of the procedures? What is the `recursion invariant'
(analogous to a loop invariant)?
Please don't use comments to repeat information that is readily available in
the code. Here is an example of poor style:
; (example of poor commenting style)
;
; Function: caar
; Description: Returns the car of the car of the given element.
;
(define caar (x)
(car (car x))
)
Everything in the comment is more easily learned just by looking at the code.
Here is an example of better style:
; (example of better commenting style)
;
; Visit vertex and every successor of vertex that appears in edges,
; in depth-first order. Executed for side effects on the global
; variables output and visited.
; Graph is the complete list of edges in the graph;
; we use structural recursion on edges.
(define dfsvisit (vertex edges graph)
(if (null? edges) '()
( ... (dfsvisit mumble (cdr edges) graph) ...)))
The comment explains what the function is doing, notes that there is no return
value but there are side effects, and explains what the parameters
mean. The comment also explains why the function terminates (the key
phrase is “structural recursion”). This comment should be
supported by comments on the global variables
output
and visited
, saying what they
represent and what the representation invariants are.
Here are two examples of very poor commenting style:
++i; /* Use the prefix autoincrement operator to increase i by one. */
for (;;) { /* Infinite loop. */
You may write documentation using appropriate comments and a README file, or
you may wish to use the noweb
tool for ``literate
programming.'' Literate programming provides a way to mix code and
documentation freely, then extract the code automatically. If you use a
literate-programming tool, you can combine everything you submit into a single
source file and dispense with a separate README file. This is how the course
code and textbook have been prepared.
Requirements for coding style
We emphasize readability and clarity. Work that we find obscure or difficult
to read will earn lower grades. Other than that, we do not require any
particular coding style. However, whatever coding style you choose must meet
these constraints:
- Code you submit should be accepted by the appropriate compiler
without errors or warnings.
- Use the same number of spaces for indentation that we use in any code
we distribute to you. In the case of C, this means that you should use
4-space indents.
- Your code should not wrap when displayed in 80 columns.
and it should not contain TAB characters.
(Why
this stuff matters
and how to deal with
it.)
- Your ML code should not have
redundant
parentheses, as described in the relevant section of the
ML supplement.
- All submitted code must obey the offside rule, as
explained below.
The offside rule
The offside rule comes into play when a definition, declaration, expression,
or statement (which I'll call a construct) has to be split across
multiple lines of code. Roughly speaking, it says that when you split a
construct, later lines have to be indented with respect to the beginning of
the construct. They may never be "outdented" to the left. The rule is based
on the start of the construct, not the start of the line.
Here's the rule given more precisely:
Code respects the offside rule if and only if whenever the first token of a
construct appears in column i, every other token appears in a
column greater than i.
Using Lisp syntax, the rule is very easy to interpret: any token enclosed in
parentheses must appear to the right of the opening parenthesis. Here's an
example of code that violates the offside rule:
(define gcd (m n)
(begin (while (!= (set r (mod m n)) 0)
(begin
(set m n)
(set n r)))
n))
The problem is with the second begin: it is part of the body
of the while loop, so it should appear to the right of the
while.
Here is the same function formatted in a way that respects the offside
rule:
(define gcd (m n)
(begin
(while (!= (set r (mod m n)) 0)
(begin
(set m n)
(set n r)))
n))
And here is the some function formatted more compactly, though perhaps more
idiosyncratically, but it does respect the offside rule:
(define gcd (m n) (begin
(while (!= (set r (mod m n)) 0)
(begin (set m n) (set n r)))
n))
Here is a C declaration that violates the offside rule:
static Valuelist evallist(Explist el, Valenv globals, Funenv functions, Valenv
formals);
The problem is in the declaration of the last parameter:
formals appears to the left of the first token,
Valenv. This code was created by a computer program that squeezes
wide code into 80 columns but may violate the offside rule. A more clever
program, or a human being, might format the code like this:
static Valuelist evallist(Explist el, Valenv globals, Funenv functions,
Valenv formals);
which respects the offside rule.
What is a contract?
A contract is a form of documentation for functions. A
function's contract relates the values of variables and other machine state at
exactly two points in time:
- The point at which the function is called
- The point at which the function returns
Functions that do not refer to global variables or to mutable state on the
heap have exceptionally simple contracts. (This is one reason people like
them.)
As an example, a function to compute the greatest common denominator of
two natural-number arguments m and n might
say simply:
Returns the largest integer that evenly divides both m
and n.
Such a contract is so simple that putting this contract in a
comment would be considered bad style, because the contract
should be evident from the function's name.
Another example of a contract would be for a sort function:
Takes a list of integers ns and returns a new list containing
the same elements as the argument list, but in nondecreasing order.
Here, some documentation of the contract is necessary, because the
name of the sort function alone does not tell you what the
sort order is.
Here is an example of documentation that is not a contract:
The function walks through the tree and at each step checks to see if the
node contains a prime number. If not, it checks the left subtree and the
right subtree for prime numbers.
There are some signs that something is very wrong:
- The documentation says nothing about what the function returns.
- The documentation contains a narrative description of
an algorithm, i.e., a computation that evolves over multiple
points in time.
- The algorithm being described is almost impossible to write a contract
for. Probably the author intended to implement a different algorithm.
Here is a good contract for a related algorithm:
Takes as argument a binary-search tree full of integers, and returns the
smallest prime number in that tree, or if the tree contains no prime number,
returns -1.
Collaboration
Programming is a creative process. Individuals must reach their own
understanding of problems and discover paths to their solutions. During this
time, discussions with friends and colleagues are encouraged—you
will do much better in the course, and in your studies, if you find people
with whom you regularly discuss problems. But those discussions
should take place in English, not in code. If you start communicating in
code, you've broken the rules.
When you reach the coding stage, therefore, group discussions are no
longer appropriate.
Each program, unless explicitly assigned as a pair problem,
must be entirely your own work.
Do not, under any circumstances, permit any other student to see any
part of your program, and do not permit yourself to see any part of another
student's program. In particular, you may not test or debug another
student's code, nor may you have another student test or debug your code. (If
you can't get code to work, consult a teaching assistent or the instructor.)
Using another's code in any form or writing code for use by another
violates the University's academic regulations.
Do not, under any circumstances, post a public question to Piazza that
contains any part of your code. Private questions directed to the
instructors are OK.
Suspected violations will be reported to the Office of the Dean of Students
for investigation and adjudication. Be careful! Penalties for violation can
be severe. A single bad decision made in a moment of weakness could lead to a
permanent blot on your academic record.
The same standards apply to all homework assignments; work you submit under
your name must be entirely your own work. Always acknowledge those
with whom you discuss problems!
Use of the library
You may look in the library (including the Internet, etc.) for ideas on how to
solve homework problems, just as you may discuss problems with your
classmates. Library sources must be acknowledged when you
submit your homework, even if you find little or nothing useful.
Some students rely heavily on the library. Although this is permitted within
the letter of the rules, it is better to grapple without heavy reliance on the
library. Doing homework is the best way for you to learn.
While library skills are important in our profession, the homework in this
course develops other skills that are even more important. Remember, you will
not have the library with you when you write your exams or go on job
interviews!
Wikipedia considered harmful
Wikipedia merits special warning. Wikipedia is a terrible
source of information on programming languages. Many of the entries
are just plain wrong, and Wikipedia's rules make it nearly
impossible for experts to correct bad articles.
Late policy
Homework that is submitted electronically (most homework) will typically be
due at 11:59 pm during the week or at 5:59 pm on a
Friday. An automatic extension of ten minutes is available at no cost to you.
If you plan on submitting your work at midnight or at six, you will have nine
minutes for last-minute changes.
Homework is expected to be submitted on time. However, we recognize that the
college life occasionally interferes with on-time submission. If you have
difficulty getting homework in on time, you have two options:
- For ordinary difficulties, each student is automatically issued seven
(7) extension tokens. By spending an extension token, you can
get an automatic 24-hour extension on all deadlines associated
with a single assignment.
Expenditure of extension tokens is governed by these rules:
- When expending an extension token, you must notify the course
staff by email. We must receive this
notification before the assignment is due. (Notification
ensures that we don't post solutions before all homework is in.)
- At most ONE extension token may be expended on any single
assignment.
- When you are out of tokens, late homework will no longer be
accepted. It will be returned ungraded, and you will receive
No Credit for the work.
- If a serious illness affects your ability to complete homework on time,
your first step is to report the illness to us. We will make
suitable arrangements.
- For extraordinary difficulties, such as bereavement, family emergencies,
or other extraordinary unpleasant events, your first step should be to
make contact with us. You must take this step before the
assignment is due. We will make special arrangements that are
suited to your circumstances.
Solutions to homeworks will not be released until the last
assignment is turned in
or until the 24-hour deadline has passed.
Students turning homework in on time may have solutions sent to
them by sending a request to the course staff.
Regrading
If we have made a mistake in grading a homework assignment, you have seven
days after the return of the assignment to call the mistake to the
attention of your TA or instructor. In such cases, we will reassess
the entire assignment and assign a new grade. The new grade may be
higher or lower than the original grade.
Computer software
The class will be run using Linux, as installed on the departmental servers
and in the laboratories in Lawson. For remote access
use data.cs.tufts.edu or sslabXX.cs.purdue.edu (for some
value of XX starting with 01. The software from the book
will be installed on these machines, but you can also grab the software and
compile it yourself; try
git clone data.cs.purdue.edu:/homes/cs456/book-code
Stupid software tricks
The Linux servers have the wonderful rlwrap program, which is
extremely handy for interacting with our interpreters. Try typing,
e.g., rlwrap impcore, and you will be able to get an interactive
editing loop with the impcore interpreter.