
Created by Zed A. Shaw Updated 2024-11-11 21:25:38
 

Exercise 29: diff and patch

To finish Part IV you will simply apply the full TDD process you've been studying to a much more involved project that may be unfamiliar to you. Refer back to Exercise 28 to confirm you know the process, and make sure you follow it strictly. Create a checklist to follow if you must.

WARNING: When you are actually working, all this strict process is not very useful. Currently you are studying the process and working on internalizing it so you can use it in the real world. That's why I am being strict about how you should follow it. This is only practice, so don't become a zealot about it when you are doing real work. The purpose of the book is to teach you a set of strategies to get work done, not teach you a religious rite you can preach to the masses.

Exercise Challenge

The diff command takes two files and produces a third file (or output) that encodes what changed in the first to make the second. It's the basis of revision control tools like git. Implementing diff in Python is fairly trivial since there's a library that does it for you, so you don't need to work on the algorithms (which can be very complex).

The patch tool is the companion to the diff tool: it takes a diff file and applies it to the original file to produce the changed file. This lets you take the changes between two files, run diff to produce only those changes, then send that .diff file to someone. That person can then use their original copy of the file and your .diff with patch to rebuild your changes.

Here's an example workflow to demonstrate how diff and patch work. I have two files, A.txt and B.txt. The A.txt file contains some simple text, and I copied it to create B.txt with some modifications:

$ diff A.txt B.txt > AB.diff
$ cat AB.diff
2,4c2,4
< her fleece was white a mud
< and every where that marry
< her lamb would chew cud
---
> her fleece was white a snow
> and every where that marry went
> her lamb was sure to go

This produces a file AB.diff that has changes from A.txt to B.txt, which you can see is fixing a rhyme I broke. Once you have this AB.diff you can use patch to apply the changes:

$ patch A.txt AB.diff
$ diff A.txt B.txt

That final command should show no output since the patch command before it effectively made A.txt have the same contents as B.txt.

Implementing these two should start with the diff command since you have a fully implemented diff using Python to cheat from. You can find it at the end of the difflib documentation but try to implement your version and see how it compares to theirs.

The real meat of this exercise is the patch tool, which Python does not implement for you. You will want to read up on the SequenceMatcher class in difflib and specifically look at the SequenceMatcher.get_opcodes method. That is your only clue to making patch work, but it's a very good clue.
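To show why get_opcodes is such a good clue, here is a rough sketch of one way the opcodes could be used. It is not a full patch tool: it works on lists of lines in memory and never reads or writes a .diff file. The make_edits and apply_edits names are hypothetical helpers, and the first line of the sample text is assumed.

```python
import difflib


def make_edits(a_lines, b_lines):
    """Turn opcodes into self-contained edits: (tag, i1, i2, new_lines).

    Each edit carries the replacement lines, so applying it later
    does not require access to b_lines.
    """
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # unchanged runs need no edit
            edits.append((tag, i1, i2, b_lines[j1:j2]))
    return edits


def apply_edits(a_lines, edits):
    """Apply the edits to a copy of a_lines and return the patched lines."""
    patched = list(a_lines)
    # Work from the last edit to the first so earlier indexes stay valid.
    for tag, i1, i2, new_lines in reversed(edits):
        # replace, delete, and insert all reduce to one slice assignment.
        patched[i1:i2] = new_lines
    return patched


# The first line is assumed; the other lines come from the AB.diff example.
a_lines = [
    "mary had a little lamb",
    "her fleece was white a mud",
    "and every where that marry",
    "her lamb would chew cud",
]
b_lines = [
    "mary had a little lamb",
    "her fleece was white a snow",
    "and every where that marry went",
    "her lamb was sure to go",
]

edits = make_edits(a_lines, b_lines)
patched = apply_edits(a_lines, edits)
print(patched == b_lines)
```

The part this sketch skips is the hard part of the exercise: writing those edits out as a diff file, then parsing that file back in so patch can run without ever seeing B.txt.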
