1.6 Deleted code can be a challenge

Book can be downloaded from here

When you find cases where code has been deleted, it might be tempting to assume that the code was deleted and there’s nothing else to think about. Unfortunately, that is not always the case. On many occasions the code was not actually deleted but moved instead.

1.6.1 Example 6

Listing 1.92:example 6
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white": "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(origColor): 
10    color = origColor.lower() 
11    if color not in colors: 
12        sys.stderr.write("There is no phrase defined for color %s\n" % origColor) 
13        sys.exit(1) 
14    phrase = colors[color] 
15<<<<<<< HEAD 
16||||||| 3c7f401 
17    phrase = "%s: %s" % (origColor, phrase) 
18======= 
19    phrase = "Color: %s\nPhrase: %s" % (origColor, phrase) 
20>>>>>>> example7/branchB 
21    return phrase 
22 
23print(getPhrase(sys.argv[1]))

dMU

The line where the phrase is modified to include the original color in the output (line 17 originally) was removed.

dML

Formatting for the phrase was modified (from line 17 to line 19). Under normal conditions, the way a line has been modified on one branch does not matter if on the other branch the line has been deleted.

Resolution

If the intent of one branch was to delete it, then there is probably no point in a modification made on it, right? So:

Listing 1.93:example 6 - resolution
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white": "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(origColor): 
10    color = origColor.lower() 
11    if color not in colors: 
12        sys.stderr.write("There is no phrase defined for color %s\n" % origColor) 
13        sys.exit(1) 
14    phrase = colors[color] 
15    return phrase 
16 
17print(getPhrase(sys.argv[1]))

In this case it’s a clear shot but more often than not analysis actually requires more brain power to get a correct informed resolution. Take this sample:

1.6.2 Example 7

Listing 1.94:example 7
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white": "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(color): 
10<<<<<<< HEAD 
11||||||| f75877d 
12    if color not in colors: 
13        sys.stderr.write("Got no phrase for color %s\n" % color) 
14        sys.exit(1) 
15======= 
16    if color not in colors: 
17        sys.stderr.write("Phrase for color %s is not defined\n" % color) 
18        sys.exit(1) 
19>>>>>>> example8/branchB 
20    phrase = getFormattedPhrase(color) 
21    return phrase 
22 
23def getFormattedPhrase(aColor): 
24    return "%s: %s" % (aColor, colors[aColor]) 
25 
26print(getPhrase(sys.argv[1]))

dMU

The conditional to check if the color is defined (defined in lines 12-14) was removed.

dML

The error message when the color is not defined was modified (from line 13 to line 17).

Resolution

Just like we did in our previous example, we get rid of the conflicting lines:

Listing 1.95:example 7 - resolution
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white":  "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(color): 
10    phrase = getFormattedPhrase(color) 
11    return phrase 
12 
13def getFormattedPhrase(aColor): 
14    return "%s: %s" % (aColor, colors[aColor]) 
15 
16print(getPhrase(sys.argv[1]))

We are doing great! But there was not much brain power we needed to spend, right? Where’s the trick? Take this next example:

1.6.3 Example 8

Listing 1.96:example 8
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white": "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(color): 
10<<<<<<< HEAD 
11||||||| 305fd0a 
12    if color not in colors: 
13        sys.stderr.write("Got no phrase for color %s\n" % color) 
14        sys.exit(1) 
15======= 
16    if color not in colors: 
17        sys.stderr.write("Phrase for color %s is not defined\n" % color) 
18        sys.exit(1) 
19>>>>>>> example9/branchB 
20    phrase = getFormattedPhrase(color) 
21    return phrase 
22 
23def getFormattedPhrase(aColor): 
24    if aColor not in colors: 
25        sys.stderr.write("Got no phrase for color %s\n" % aColor) 
26        sys.exit(1) 
27    return "%s: %s" % (aColor, colors[aColor]) 
28 
29print(getPhrase(sys.argv[1]))

This looks very much like example 7.

dMU

The check for the color to be defined was removed.

dML

The error message when color is not defined was modified we modified.

Resolution

We should get rid of the code then, shouldn’t we? Just like we did in Well... not exactly. If you look at the definition of getFormattedPhrase(), on lines 23-27 you have the exact same error message that was on MB.... except for the name of the variable for color. Look at MB:

Listing 1.97:example 8 - MB
11||||||| 305fd0a 
12    if color not in colors: 
13        sys.stderr.write("Got no phrase for color %s\n" % color) 
14        sys.exit(1) 
15=======

Look at current definition of getFormattedPhrase(), outside of the CB:

Listing 1.98:example 8 - getFormattedPhrase()
23def getFormattedPhrase(aColor): 
24    if aColor not in colors: 
25        sys.stderr.write("Got no phrase for color %s\n" % aColor) 
26        sys.exit(1) 
27    return "%s: %s" % (aColor, colors[aColor])

And this hints us that the if block is not really gone. Actually it was moved as part of a refactoring effort 32 . In this case what we need to do is adjust the wording of the code in getFormattedPhrase() as it was modified in dML. So the correct conflict resolution should look like this:

Listing 1.99:example 8 - resolution
1#!/usr/bin/python 
2 
3import sys 
4 
5colors = {"black": "black mirror", 
6          "white": "white noise", 
7          "blue": "blue sky"} 
8 
9def getPhrase(color): 
10    phrase = getFormattedPhrase(color) 
11    return phrase 
12 
13def getFormattedPhrase(aColor): 
14    if aColor not in colors: 
15        sys.stderr.write("Phrase for color %s is not defined\n" % aColor) 
16        sys.exit(1) 
17    return "%s: %s" % (aColor, colors[aColor]) 
18 
19print(getPhrase(sys.argv[1]))

Careful with the name of the variable!!!

Don’t think that a conflict is restricted to the conflicting section of code. Other parts of the code might need adjustments. Nothing is off-limits to solve conflicts. Also consider that in this case it was a pretty small file where it is easy to spot what happened with the code. But consider what the situation would be like if the file were a few hundred lines long and you are not able to easily see that the lines were moved to a different place. Or, even more difficult, if the function were defined in a separate module/file. Then you might have a much harder time figuring out what happened and what to do..... just like:

1.6.4 Example 9

Listing 1.100:example 9
1#!/usr/bin/python 
2 
3from module import getFormattedPhrase 
4import sys 
5 
6def getPhrase(color): 
7<<<<<<< HEAD 
8||||||| f78479a 
9    if color not in colors: 
10        sys.stderr.write("Got no phrase for color %s\n" % color) 
11        sys.exit(1) 
12======= 
13    if color not in colors: 
14        sys.stderr.write("Color %s has no phrase\n" % color) 
15        sys.exit(1) 
16>>>>>>> example10/branchB 
17    phrase = getFormattedPhrase(color) 
18    return phrase 
19 
20print(getPhrase(sys.argv[1]))

This looks like example 7 where we only needed to delete the code and be happy with it. But if you looked inside module.py, you would see the same block of code that we saw moving on example 8:

Listing 1.101:example 9 - module.py
1import sys 
2 
3colors = {"black": "black mirror", 
4          "white": "white noise", 
5          "blue": "blue sky"} 
6 
7def getFormattedPhrase(aColor): 
8    if aColor not in colors: 
9        sys.stderr.write("Got no phrase for color %s\n" % aColor) 
10        sys.exit(1) 
11    return "%s: %s" % (aColor, colors[aColor])

In this case resolving the conflict would require adjusting the error message in module.py following the new phrasing of LB from our CB from example.py, adjusting variable name, just like we did in example 8:

Listing 1.102:example 9 - module.py final
1import sys 
2 
3colors = {"black": "black mirror", 
4          "white": "white noise", 
5          "blue": "blue sky"} 
6 
7def getFormattedPhrase(aColor): 
8    if aColor not in colors: 
9        sys.stderr.write("Color %s has no phrase\n" % aColor) 
10        sys.exit(1) 
11    return "%s: %s" % (aColor, colors[aColor])

And removing the CB from example.py:

Listing 1.103:example 9 - example.py final
1#!/usr/bin/python 
2 
3from module import getFormattedPhrase 
4import sys 
5 
6def getPhrase(color): 
7    phrase = getFormattedPhrase(color) 
8    return phrase 
9 
10print(getPhrase(sys.argv[1]))

1.6.5 How can we avoid... guessing?

First, do not assume that this is a 1-step process. Hunting down for when a deletion happened to make sure if code was moved or not normally involves pulling up your sleeves and getting dirty in history. So, without further ado, let’s try with a real project:

1.6.6 Example 10 - a git example for deleted code

From git repo, let’s checkout revision 0da63da794 and merge revision f1928f04b2 33. There will be a CB in builtin/grep.c.

Listing 1.104:example 10 - cb in builtin/grep.c
1123<<<<<<< HEAD 
1124        if (recurse_submodules && untracked) 
1125                die(_("--untracked not supported with --recurse-submodules")); 
1126 
1127||||||| d0654dc308 
1128        if (recurse_submodules && (!use_index || untracked)) 
1129                die(_("option not supported with --recurse-submodules")); 
1130 
1131======= 
1132>>>>>>> f1928f04b2 
1133        if (!show_in_pager && !opt.status_only) 
1134                setup_pager();

And you can see that it involves deleted code.

dMU

Modified the conditional and modified the wording of the message printed on die() on line 1125.

dML

The whole block wad deleted.

Resolution

Should we remove the block just like that? Let’s try to fund out when the block was deleted on the other branch.

Listing 1.105:example 10 - git blame –reverse
$ git blame -s --reverse d0654dc308..f1928f04b2 -- builtin/grep.c 



d7992421e1a 1118)       if (recurse_submodules && (!use_index || untracked)) 
d7992421e1a 1119)               die(_("option not supported with --recurse-submodules")); 
d7992421e1a 1120) 
f1928f04b25 1121)       if (!show_in_pager && !opt.status_only) 
f1928f04b25 1122)               setup_pager();

git blame –reverse allows us to see when was the last revision when code was present in the history. The two revisions that I used were the common ancestor (provided in the CB), where we know that the line was still present, and the revision we are trying to merge aka the other branch.

And we can see that the last revision when the two lines we are tracking were present is in d7992421e1a, in the direction of the other branch. Next step would see what revision followed it to know in what revision it was actually deleted:

Listing 1.106:example 10 - git log –oneline
$ git log --graph --oneline d7992421e1a..f1928f04b2 -- builtin/grep.c 
* f1928f04b2 grep: use no. of cores as the default no. of threads 
* 70a9fef240 grep: move driver pre-load out of critical section 
* 1184a95ea2 grep: re-enable threads in non-worktree case 
* 6c307626f1 grep: protect packed_git [re-]initialization 
* c441ea4edc grep: allow submodule functions to run in parallel

And by checking this handful of revisions, we realize that the lines were moved, not really deleted, in c441ea4edc:

Listing 1.107:example 10 - git show c441ea4edc
$ git show c441ea4edc 



diff --git a/builtin/grep.c b/builtin/grep.c 
index d3ed05c1da..ac3d86c2e5 100644 
--- a/builtin/grep.c 
+++ b/builtin/grep.c 



@@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix) 
        pathspec.recursive = 1; 
        pathspec.recurse_submodules = !!recurse_submodules; 
 
+       if (recurse_submodules && (!use_index || untracked)) 
+               die(_("option not supported with --recurse-submodules")); 

        if (list.nr || cached || show_in_pager) { 
                if (num_threads > 1) 
                        warning(_("invalid option combination, ignoring --threads")); 



@@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix) 
                } 
        } 
 
-       if (recurse_submodules && (!use_index || untracked)) 
-               die(_("option not supported with --recurse-submodules")); 

        if (!show_in_pager && !opt.status_only) 
                setup_pager();

Are the lines there on the conflicted file currently on the working tree? They are, though a little displaced:

Listing 1.108:example 10 - another section of builtin/grep.c
1054        pathspec.recurse_submodules = !!recurse_submodules; 
1055 
1056        if (recurse_submodules && (!use_index || untracked)) 
1057                die(_("option not supported with --recurse-submodules")); 
1058 
1059        if (show_in_pager) {

Let’s bring over the changes as they were introduced in UB and we should be fine:

Listing 1.109:example 10 - adjust section of builtin/grep.c
1054        pathspec.recurse_submodules = !!recurse_submodules; 
1055 
1056        if (recurse_submodules && untracked) 
1057                die(_("--untracked not supported with --recurse-submodules")); 
1058 
1059        if (show_in_pager) {

Get rid of the CB altogether:

Listing 1.110:example 10 - remove CB in builtin/grep.c
1120                } 
1121        } 
1122 
1123        if (!show_in_pager && !opt.status_only) 
1124                setup_pager();

And if we compare with revision 56ceb64eb0, there should be no differences.

1.6.7 Do not be impressed by big conflicts

Deleted code can produce big conflicts that are really not that big when you look more closely. When there’s deleted code on one branch and as much as a single line of the code that got deleted is touched on another, git can’t do much more than show you the original code and code as it ended up on the other branch if you are using diff3, as you should always do. If the block was 5 lines in the beginning and 6 in the end, well, that’s 11 lines plus markers... not too big. But if the original block was 300 lines and the final block is 300 with a single line modified, then we are talking about 300 * 2 + markers. And all of that because of a single line that is modified. The examples we attempted so far related to deleted code haven’t been that big... so let’s try one with a few more lines.

1.6.8 Example 11

From git repo, checkout revision d88949d86e and merge d9f62dfa0d34 . Take a look at the CB in ref-filter.c. The CB starts on line 1680, ends up on line 1959 soI won’t be copying the content into the book because of the sheer size. But let’s not allow ourselved to be stopped from doing the analysis and resolution:

dMU

The whole section was deleted

dML

I’ll give you 5 minutes to sort it out on your own, then I will tell you what the difference is. Stop reading, Take a look at the code and come back in 5 minutes! No cheating!

[5 minutes go by]

So... did you get it? Switching !oidcpm() for oideq(), right? Was it tricky to catch? I bet it was.... but I can’t really tell because I actually didn’t try to do it visually. I asked git to give me a little help to figue it out:

Listing 1.111:example 11 - git diff
1054$ git diff HEAD...MERGE_HEAD -- ref-filter.c 
1055diff --git a/ref-filter.c b/ref-filter.c 
1056index 0bccfceff2..ccca317ce1 100644 
1057--- a/ref-filter.c 
1058+++ b/ref-filter.c 
1059@@ -1710,7 +1710,7 @@ struct contains_stack { 
1060 static int in_commit_list(const struct commit_list *want, struct commit *c) 
1061 { 
1062        for (; want; want = want->next) 
1063-               if (!oidcmp(&want->item->object.oid, &c->object.oid)) 
1064+               if (oideq(&want->item->object.oid, &c->object.oid)) 
1065                        return 1; 
1066        return 0; 
1067 }

Git is an amazing analysis tool. Learn to squiz as much as you can from it.

Coming back to our conflict... See? All of those 279 lines of an apparenty nasty conflict were produced by this tiny litle change (and the deletion in dMU, of course). Now that we know what it was all about, can you solve this conflict on your own, right? Give it a try. Here is the full list of modifications for you to check when you finish:

After finding out that the code we were talking about was moved to a separate file:

Listing 1.112:example 11 - commit-reach.c
426static int in_commit_list(const struct commit_list *want, struct commit *c) 
427
428        for (; want; want = want->next) 
429                if (oideq(&want->item->object.oid, &c->object.oid)) 
430                        return 1; 
431        return 0; 
432}

Removing the CB from ref-filter.c:

Listing 1.113:example 11 - ref-filter.c
1679/* 
1680 * Return 1 if the refname matches one of the patterns, otherwise 0. 
1681 * A pattern can be a literal prefix (e.g. a refname "refs/heads/master" 
1682 * matches a pattern "refs/heads/mas") or a wildcard (e.g. the same ref 
1683 * matches "refs/heads/mas*", too). 
1684 */ 
1685static int match_pattern(const struct ref_filter *filter, const char *refname) 
1686{

And there’s a conflict on this file also:

Listing 1.114:example 11 - builtin/log.c
1763        if (ignore_if_in_upstream) { 
1764                /* Don’t say anything if head and upstream are the same. */ 
1765                if (rev.pending.nr == 2) { 
1766                        struct object_array_entry *o = rev.pending.objects; 
1767                        if (oideq(&o[0].item->oid, &o[1].item->oid)) 
1768                                goto done; 
1769                } 
1770                get_patch_ids(&rev, &ids); 
1771        }

1.6.9 Bottom line

Think of conflicts related to deleted code as an iceberg. You only get to see the tip of the conflict, but it can be an inverted mountain of ice beneath the surface bigger than Olympus. 35

1.6.10 Exercises

Exercise 6

From git repo, checkout revision 0194c9ad72 and merge revision aa46a0da30. Solution is here. Copyright 2020 Edmundo Carmona Antoranz