42: uniq
The sort and uniq commands go together like peanut butter and chocolate. The uniq command ("unique") reads its input and removes repeated lines, but it only compares each line to the one before it, so duplicates have to be next to each other for uniq to catch them. Sorting the input first guarantees that, so you use sort like this:
sort somefile.txt | uniq
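To see why the sort matters, here is a minimal sketch that applies the same "drop a line if it equals the one before it" rule to a list of lines, with and without sorting first. The function names collapse_adjacent and sort_then_collapse are made up for this illustration, and the lines are held in a vector only to keep the demonstration short; the real exercise should not store the input this way.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: drop every line that equals the line right before it,
// which is the only comparison uniq makes.
std::vector<std::string> collapse_adjacent(std::vector<std::string> lines) {
    // std::unique moves the kept elements to the front and returns the new logical end
    auto last = std::unique(lines.begin(), lines.end());
    lines.erase(last, lines.end());
    return lines;
}

// Hypothetical helper: sort first, mirroring `sort somefile.txt | uniq`.
std::vector<std::string> sort_then_collapse(std::vector<std::string> lines) {
    std::sort(lines.begin(), lines.end());
    return collapse_adjacent(std::move(lines));
}
```

Given the lines b, a, a, b, collapse_adjacent keeps b, a, b because the two b lines are not neighbors, while sort_then_collapse keeps only a, b.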
The Challenge
To confirm you implemented uniq correctly, you may store at most two strings: one for the previous line and one for the line you just read. If you are storing the entire input to find unique lines then you did it wrong.
You should then implement the -d option, which only prints lines that are repeated.
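If you get stuck on -d, here is one possible sketch, not the official solution. It assumes sorted input and still keeps only two strings plus a flag that remembers whether the current run of duplicates has already been printed. The names uniq_repeated and uniq_repeated_text are hypothetical, and the code writes to a std::ostream instead of using fmt so it can be tried without extra libraries.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: print one copy of each line that appears
// more than once in a row, similar to `uniq -d`.
void uniq_repeated(std::istream& in, std::ostream& out) {
    std::string prev;
    std::string line;
    if (!std::getline(in, prev)) return;  // empty input: nothing to print

    bool reported = false;  // has the current run of duplicates been printed yet?
    while (std::getline(in, line)) {
        if (line == prev) {
            // print the repeated line only the first time the repeat is seen
            if (!reported) {
                out << prev << '\n';
                reported = true;
            }
        } else {
            // a new run starts here
            prev = line;
            reported = false;
        }
    }
}

// Hypothetical convenience wrapper for trying the function on a string.
std::string uniq_repeated_text(const std::string& text) {
    std::istringstream in{text};
    std::ostringstream out;
    uniq_repeated(in, out);
    return out.str();
}
```

For the input lines a, a, a, b, c, c this prints a and c, since b appears only once.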
The Code
See my first version of uniq (uniq.cpp):
#include <fmt/core.h>
#include <iostream>
#include <unistd.h>
#include <fstream>
#include <string>
#include <vector>
#include <algorithm>
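void uniq_file(std::istream& in) {
    std::string prev;
    std::string line;

    // nothing to print if the input is empty or could not be read
    if(!getline(in, prev)) return;

    // first line is always printed to start
    fmt::println("{}", prev);

    // only print a line when it differs from the one before it
    while(getline(in, line)) {
        if(line != prev) {
            fmt::println("{}", line);
            prev = line;
        }
    }
}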
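int main(int argc, char* argv[]) {
    if(argc > 1) {
        // treat each argument as a file to process
        for(int i = 1; i < argc; i++) {
            std::ifstream in_file{argv[i]};
            if(!in_file) {
                std::cerr << "uniq: cannot open " << argv[i] << "\n";
                return 1;
            }
            uniq_file(in_file);
        }
    } else {
        // no files given, so read standard input
        uniq_file(std::cin);
    }
}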
The Discussion
The reason uniq needs sort is that it only compares each line to the previous one, and that catches every duplicate only when identical lines sit next to each other. Either you sort the input and then skip lines that match the one you just saw, or you have to remember every distinct line in some kind of map or set, which can mean holding nearly the whole file in memory.
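For comparison, here is a minimal sketch of the "store everything in a set" alternative. It works on unsorted input, but the set grows with the number of distinct lines, which is exactly what the challenge tells you not to do. The names uniq_unsorted and uniq_unsorted_text are hypothetical, and the code writes to a std::ostream instead of using fmt so it has no extra dependencies.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical helper: print each distinct line once, in the order it is
// first seen, without requiring sorted input.
void uniq_unsorted(std::istream& in, std::ostream& out) {
    std::unordered_set<std::string> seen;  // grows with every new distinct line
    std::string line;
    while (std::getline(in, line)) {
        // insert() returns a pair whose bool is true only when the line was new
        if (seen.insert(line).second) {
            out << line << '\n';
        }
    }
}

// Hypothetical convenience wrapper for trying the function on a string.
std::string uniq_unsorted_text(const std::string& text) {
    std::istringstream in{text};
    std::ostringstream out;
    uniq_unsorted(in, out);
    return out.str();
}
```

The tradeoff is memory for convenience: the two-string version handles input of any size once it is sorted, while this version skips the sort but has to keep every distinct line around until the input ends.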
Further Study
You should try to implement as many options for uniq and sort as you can. It's good to work on both at the same time since many options to sort go with options to uniq.