The website TeachYourselfCS.com was recently
featured on Hacker News, and after reading through I’ve decided to embark on a
new learning journey by going through one of its subjects as properly as I can.
The subject I’ve chosen is Algorithms and Data
Structures, as I think that will be
most interesting to me personally, and will have the most immediate practical
application in the kinds of things I like to do.
I’m very grateful that the recommended lecture series, Skiena’s Algorithms
Lectures, is freely
available online (YouTube copy), as are course
documents for that course and
various others. I was also able to find a copy of The Algorithm Design Manual
with a quick search.
I may be making some notes for myself here as I work through the course. That
was the main motivation behind
EastAsiaStudent.net and a lot of the posts here,
as it can be a good way to focus, and also makes an easy reference for me to go
back over later.
Lecture 1: Introduction to algorithms
Lecture video
(YouTube), lecture notes.
Sorting
(I’m just making these initial notes in order to play around with
MathJAX.)
Sorting is a classic example of an algorithmic problem, and can be defined as:
Input: A sequence of $N$ numbers $a_1...a_n$
Output: The permutation of the sequence such that $a_1 \le a_2 ... \le a_n$
Robot tour
Another example of an algorithmic problem. Given a set of pins arranged on a
circuit board, find the shortest path for a robot arm to connect all the pins.
Someone suggests Dijkstra’s
algorithm, but this is not
suitable as it’s about finding the shortest path between two points, and not the
shortest tour across several points.
Brute force
A possible correct algorithm is brute force: determine all potential tours and
select the one with the shortest total distance:
$d = \infty$
For each of the $n!$ permutations $\Pi_i$ of the $n$ points:
If $(cost(\Pi_i) \le d)$ then
$d = cost(\Pi_i)$ and $P_{min} = \Pi_i$
Return $P_{min}$
This would take forever, though: e.g. $20! = 2432902008176640000$ permutations
to check.
Nearest neighbour tour
Another idea is to pick an arbitrary starting point, and keep moving to the
nearest untouched neighbour:
$p = p_0$
$i = 0$
While there are still unvisited points:
$i = i+1$
$p_i = closest\_unvisited\_neighbour(p)$
Visit $p_i$
Go back to $p_0$ from $p_i$
This produces a tour but there are simple examples where it produces wasteful
tours.
The main point is that it is a greedy algorithm, i.e. one that just looks
for the next best step from where it is. In general these are poor solutions if
there is no proof. The guideline is: look for examples that make it wrong.
Conclusion to this problem is that there is no correct and efficient algorithm
for it (it’s the well known travelling salesman
problem). It is NP
complete.
Fast algorithms on slow computers
Point that fast algorithms always tend to outperform slow ones as the size
increases, regardless of the hardware.
Maximum scheduling problem
How can we pick from a selection of overlapping scheduled slots to maximise the
time spent (e.g. from different jobs and we want to maximise the number of jobs
we take)?
Input: a set $I$ of $n$ intervals on a line
Output: the largest subset of non-overlapping intervals from $I$
One possible strategy is to keep selecting the interval that interferes with the
least others; produces a correct solution for the example given. Can we find an
example that makes it fail?
Ties are interesting when they’re not the final choice, as you need to know how
picking either branch will affect subsequent choices down the chain. This is
actually why this approach is a heuristic and not a correct algorithm.
Again, focusing on examples that break it is how we can discover this.
First job to complete
Always take the job that ends first:
While $(I \neq \theta)$
Take job $j$ with earliest completion date
(Remove any jobs conflicting with $j$)
This is the optimal solution, and works in either direction (can go right-to-
left selecting earliest start dates).
View post:
Getting started with Teach Yourself CS: Algorithms and Data Structures
|