Exercise 29: diff and patch
To finish Part IV you will simply apply the full TDD process you've been studying on a much more involved project that may be unfamiliar to you. Refer back to Exercise 28 to confirm you know the process, and make sure you follow it strictly. Create a checklist to follow if you must.
WARNING: When you are actually working, all this strict process is not very useful. Currently you are studying the process and working on internalizing it so you can use it in the real world. That's why I am being strict about how you should follow it. This is only practice, so don't become a zealot about it when you are doing real work. The purpose of the book is to teach you a set of strategies to get work done, not teach you a religious rite you can preach to the masses.
Exercise Challenge
The diff command takes two files and produces a third file (or output) that encodes what changed in the first to make the second. It's the basis of tools like git and other revision control tools. Implementing diff in Python is fairly trivial since there's a library that does it for you, so you don't need to work on the algorithms (which can be very complex).
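To give a feel for how little code the library route needs, here is a minimal sketch using difflib.unified_diff from the standard library. The two lists are hypothetical stand-ins for the contents of two files, and the first line of each is my assumption. Note that this prints the unified format (the one git shows), not the older format that plain diff prints by default.

```python
import difflib

# Hypothetical stand-ins for two files; the first line of each is assumed.
a_lines = [
    "marry had a little lamb\n",
    "her fleece was white a mud\n",
    "and every where that marry\n",
    "her lamb would chew cud\n",
]
b_lines = [
    "marry had a little lamb\n",
    "her fleece was white a snow\n",
    "and every where that marry went\n",
    "her lamb was sure to go\n",
]

# unified_diff is a generator that yields the diff one line at a time.
diff_lines = list(
    difflib.unified_diff(a_lines, b_lines, fromfile="A.txt", tofile="B.txt")
)
print("".join(diff_lines), end="")
```

Lines starting with - come from the first file and lines starting with + come from the second.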
The patch tool is the companion to the diff tool as it takes a diff file and applies it to another file to produce the third file. This lets you take changes you've made in two files, run diff to produce only the changes, then send that .diff file to someone. That person can then use their original copy of the file and your .diff with patch to rebuild your changes.
Here's an example workflow to demonstrate how diff and patch work. I have two files, A.txt and B.txt. The A.txt file contains some simple text, and then I copied it and created B.txt with some modifications:
$ diff A.txt B.txt > AB.diff
$ cat AB.diff
2,4c2,4
< her fleece was white a mud
< and every where that marry
< her lamb would chew cud
---
> her fleece was white a snow
> and every where that marry went
> her lamb was sure to go
This produces a file AB.diff that has the changes from A.txt to B.txt, which you can see is fixing a rhyme I broke. Once you have this AB.diff you can use patch to apply the changes:
$ patch A.txt AB.diff
$ diff A.txt B.txt
That final command should show no output since the patch command before it effectively made A.txt have the same contents as B.txt.
Implementing these two should start with the diff command since you have a fully implemented diff using Python to cheat from. You can find it at the end of the difflib documentation, but try to implement your version and see how it compares to theirs.
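If you want your version to print the same style of output as the AB.diff example (the 2,4c2,4 header followed by < and > lines), one possible starting point is sketched below. It is only a sketch under assumptions: the names line_range and normal_diff are made up for illustration, the input lines have no trailing newlines, the first line of the sample text is assumed, and difflib may group changes differently from the real diff command on more complicated files.

```python
import difflib


def line_range(start, stop):
    """Turn a 0-based half-open slice into 1-based diff line numbers."""
    if stop == start:
        # Empty range: the line after which the change happens.
        return str(start)
    if stop - start == 1:
        return str(start + 1)
    return f"{start + 1},{stop}"


def normal_diff(a_lines, b_lines):
    """Return the lines of a 'normal' format diff (hypothetical helper)."""
    letters = {"replace": "c", "delete": "d", "insert": "a"}
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines are not shown in this format
        out.append(f"{line_range(i1, i2)}{letters[tag]}{line_range(j1, j2)}")
        out.extend("< " + line for line in a_lines[i1:i2])
        if tag == "replace":
            out.append("---")
        out.extend("> " + line for line in b_lines[j1:j2])
    return out


# Sample text; the first line is an assumption, the rest is from AB.diff.
a_lines = [
    "marry had a little lamb",
    "her fleece was white a mud",
    "and every where that marry",
    "her lamb would chew cud",
]
b_lines = [
    "marry had a little lamb",
    "her fleece was white a snow",
    "and every where that marry went",
    "her lamb was sure to go",
]

print("\n".join(normal_diff(a_lines, b_lines)))
```

Write your tests first, then compare what your own implementation prints against the real diff command on the same pair of files.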
The real meat of this exercise is the patch tool, which Python does not implement for you. You will want to read up on the SequenceMatcher class in difflib and specifically look at the SequenceMatcher.get_opcodes function. That is your only clue to making patch work, but it's a very good clue.
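To see why get_opcodes is such a good clue, here is a small sketch of what it returns and how those instructions can turn one list of lines into another. The sample lists and the rebuild function are hypothetical, and this version cheats by reading the new lines straight out of b_lines. A real patch only has the original file and the diff, so it would have to get those lines from the diff file instead.

```python
import difflib

# Hypothetical sample data standing in for two versions of a file.
a_lines = ["one", "two", "three", "four"]
b_lines = ["one", "2", "three", "four", "five"]

matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
# Each opcode is (tag, i1, i2, j1, j2): how a_lines[i1:i2] becomes b_lines[j1:j2].
opcodes = matcher.get_opcodes()
for opcode in opcodes:
    print(opcode)


def rebuild(a_lines, b_lines, opcodes):
    """Follow the opcodes to turn a_lines into b_lines (illustrative only)."""
    result = []
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            result.extend(a_lines[i1:i2])  # keep the unchanged lines
        elif tag in ("replace", "insert"):
            result.extend(b_lines[j1:j2])  # take the new lines
        # "delete" adds nothing: those lines are simply dropped
    return result


print(rebuild(a_lines, b_lines, opcodes) == b_lines)
```

The tags are equal, replace, delete, and insert, which map closely onto the kinds of changes a diff file records.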