3.2 Full file conflicts

So, have you enjoyed all the previous conflicts? I bet you have. Well... if you liked those, you will love this one!

3.2.1 Example 17

From OpenJDK’s JDK repo, checkout revision 9d3e0870754 and merge 2978ffb9f9d4.

You will see a conflict pop up on file langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml. But it is not just a conflict. It is the conflict. Instead of having a few conflicted lines, the whole file is conflicted. Here are some excerpts from it (line numbers are visible on the left side):

Listing 3.18:example 17 - conflicted file
     1  <<<<<<< HEAD 
     2  <?xml version='1.0' encoding='utf-8'?>
     3 
     4  <!-- 
     5   Copyright 2003 Sun Microsystems, Inc.  All Rights Reserved. 



   205      </SerializedForm> 
   206  </Doclet> 
   207  ||||||| 4ae52d7dc1e 
   208  <?xml version='1.0' encoding='utf-8'?>
   209 



   411      </SerializedForm> 
   412  </Doclet> 
   413  ======= 
   414  <?xml version='1.0' encoding='utf-8'?>
   415 
   416  <!-- 



   616         <Footer/> 
   617      </SerializedForm> 
   618  </Doclet> 
   619  >>>>>>> 2978ffb9f9d

Before I go into the details, let me show you what happened in the other branch in terms of changes for this file:

Listing 3.19:example 17 - changes from the other branch
$ git diff HEAD...MERGE_HEAD -- langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
diff --git a/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml b/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
index 8eaa2d77abc..29c473790e4 100644 
--- a/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
+++ b/langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
@@ -1,7 +1,7 @@ 
 <?xml version='1.0' encoding='utf-8'?>
 
 <!-- 
- Copyright 2003 Sun Microsystems, Inc.  All Rights Reserved. 
+ Copyright 2003-2009 Sun Microsystems, Inc.  All Rights Reserved.^M 
  DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
 
  This code is free software; you can redistribute it and/or modify it

A simple one-liner of a change. Is the line different from what we have in HEAD?

Listing 3.20:example 17 - file as it is on HEAD
$ git show HEAD:langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml | head -n 5 
<?xml version='1.0' encoding='utf-8'?>
 
<!-- 
 Copyright 2003 Sun Microsystems, Inc.  All Rights Reserved. 
 DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

The line looks exactly like what is removed on the other branch... this looks like a no-brainer, right? There should be no conflict whatsoever. What happened then? There’s a short answer to this question but... why would I spare us the pleasure of a long story?

Text files are made up of lines. To merge code, git compares the lines that make up a file to see which are the same and which have changed. Now, have you ever asked yourself how a line is defined? Each line is separated from the next by a marker called End Of Line (or EOL, for short). But there is a tiny little problem: there are not one or two but three(!!!) different EOL markers, and each one is associated with a different operating system: CR for classic Mac OS, LF for *NIX, and CR followed by LF for Windows.

Here’s a little text file with different EOL formats so you can see the difference at the binary level:

Figure 3.1:Mac EOL
Figure 3.2:NIX EOL
Figure 3.3:Windows EOL
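The byte-level difference can also be printed with a few lines of Python. This is only a minimal sketch: the two sample lines and the helper names (`with_eol`, `hex_dump`) are made up for the illustration.

```python
# Two short lines, terminated with each of the three EOL markers.
LINES = [b"Hi", b"you"]

EOL_MARKERS = {
    "Mac (classic)": b"\r",   # CR, byte 0x0d
    "*NIX": b"\n",            # LF, byte 0x0a
    "Windows": b"\r\n",       # CR followed by LF, bytes 0x0d 0x0a
}


def with_eol(lines, eol):
    """Join the lines, ending each one with the given EOL marker."""
    return b"".join(line + eol for line in lines)


def hex_dump(data):
    """Render bytes the way a hex editor would: two hex digits per byte."""
    return data.hex(" ")


for name, eol in EOL_MARKERS.items():
    print(f"{name:<14}{hex_dump(with_eol(LINES, eol))}")
```

The text bytes (48 69 and 79 6f 75) are identical in the three outputs; only the bytes between them change.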

On a new file, text editors tend to set the EOL format to the one associated with the operating system that the editor is running on. If the file already existed, text editors will keep using the format that the file had when it was opened, even if it is different from that of the OS the editor is running on. Go save a text file with different EOL formats, open it in a hex editor and see how the markers between the lines change.

You might be asking yourselves “How did we end up in this EOL mess?” It’s a fair question to ask and, just like any other good story, it’s about betrayals, backstabbings from close friends and greed but it’s way outside of the focus of this manual so I will kindly ask you to read Wikipedia’s Newline History if you are curious to know how this mess came to be.

Coming back to our problem at hand, from the point of view of git, a change in EOL format will effectively change the content of each and every single line that makes up the file. So, you open a preexisting file, you change a single line and then save using the wrong EOL format, and to git it’s like the whole file was cleared up and rewritten as a whole... in Esperanto. It’s an entirely different content from top to bottom. Let me say it again, just so that the concept sinks in: Completely (dramatic pause) Different (dramatic pause) File. None of the preexisting lines survived that revision.
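To see the effect in numbers, here is a small sketch that counts how many lines of a file survive an edit. It uses Python's `difflib` as a stand-in for a line-based comparison (it is not the algorithm git uses), and the sample content and the function name `surviving_lines` are invented for the illustration.

```python
import difflib


def surviving_lines(old: bytes, new: bytes) -> int:
    """Count the lines of old that also show up, byte for byte, in new."""
    # keepends=True keeps the EOL marker as part of each line,
    # which is how a line-based comparison sees the content.
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


original = b"<!--\r\n Copyright 2003\r\n-->\r\n"

# One line edited, EOL format kept (Windows).
same_eol_edit = b"<!--\r\n Copyright 2003-2009\r\n-->\r\n"
# Same edit, but the file was saved with *NIX line breaks.
other_eol_edit = b"<!--\n Copyright 2003-2009\n-->\n"

print(surviving_lines(original, same_eol_edit))
print(surviving_lines(original, other_eol_edit))
```

With the EOL format kept, two of the three lines survive. With the EOL format changed, none do, even though the visible text of those two lines is the same.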

Now, is that really what happened here? Well, let’s get the info from unix2dos to see how many line breaks of each type there are:

Listing 3.21:example 17 - file types
$ git show HEAD:langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml | unix2dos -imud 
       0     205       0 
$ git show MERGE_HEAD:langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml | unix2dos -imud 
     205       0       0 
$ git show 4ae52d7dc1ef:langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml | unix2dos -imud 
     205       0       0

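And we can see that on HEAD the line breaks are counted in a different column. unix2dos prints the number of DOS, Unix and Mac line breaks, in that order, so HEAD has 205 Unix line breaks while both the other branch and the common ancestor have 205 DOS line breaks: the EOL format changed on our side.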
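If unix2dos is not at hand, the same kind of count can be approximated in Python. This is a rough sketch of the idea, not how unix2dos is implemented; the function name `count_line_breaks` is made up, and it returns the counts in DOS, Unix, Mac order to mirror the columns in the listing.

```python
def count_line_breaks(data: bytes):
    """Return the number of (DOS, Unix, Mac) line breaks found in data."""
    dos = data.count(b"\r\n")          # CR LF pairs
    unix = data.count(b"\n") - dos     # LF not preceded by CR
    mac = data.count(b"\r") - dos      # CR not followed by LF
    return dos, unix, mac


windows_file = b"<!--\r\n Copyright 2003\r\n-->\r\n"
nix_file = b"<!--\n Copyright 2003\n-->\n"

print(count_line_breaks(windows_file))
print(count_line_breaks(nix_file))
```

The content of a revision can be fed to it from the output of `git show REVISION:PATH`, the same command used in the listing.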

Before considering what to do about it, let us ask ourselves “why did this happen in the first place?” To our own amazement, there are some rather simple possibilities for this to come up:

Ok, ok... enough theory. Let’s consider the different approaches you might follow to get out of this mess.

Stay on HEAD, bring over changes from the other branch

The first thing to notice is that this kind of defeats the purpose of the merge, right? You would like to get all changes from the other branch automagically copied over to your code and get conflicts only on the pieces that really deserve them. Now we are in a situation where we will need to do everything by hand. Luckily for us, in this case, we have already seen what needs to be brought over from the other branch: we need to adjust the years covered in the copyright. So, these would be the first few lines of the resulting file:

Listing 3.22:example 17 - HEAD with changes from the other branch
<<<<<<< HEAD
<?xml version='1.0' encoding='utf-8'?>

<!--
 Copyright 2003-2009 Sun Microsystems, Inc.  All Rights Reserved.
 DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

Then we remove the other parts and the conflict markers. Then wrap up the merge:

Listing 3.23:example 17 - Wrap up the merge
$ git add langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
$ git merge --continue 
[detached HEAD b4a035c194c] Merge commit '2978ffb9f9d' into HEAD

That’s good. As a quick fix, this does the job... but you haven’t really tackled the root cause of the problem. There is still an EOL-format discrepancy between the branches. As a little experiment, let’s checkout the revision we just asked to merge, 2978ffb9f9d, let’s write a little change on this file and then let’s come back to the merge revision we just created and ask to merge again.

Listing 3.24:example 17 - checkout 2978ffb9f9d
$ git checkout 2978ffb9f9d 
Warning: you are leaving 1 commit behind, not connected to 
any of your branches: 
 
  b4a035c194c Merge commit '2978ffb9f9d' into HEAD
 
If you want to keep it by creating a new branch, this may be a good time 
to do so with: 
 
 git branch <new-branch-name> b4a035c194c 
 
HEAD is now at 2978ffb9f9d Merge

I will set the copyright to be 2003-2020, now, for the sake of the example:

Listing 3.25:example 17 - modified file on top of 2978ffb9f9d
<?xml version='1.0' encoding='utf-8'?>

<!--
 Copyright 2003-2020 Sun Microsystems, Inc.  All Rights Reserved.
 DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

Then we finish the revision:

Listing 3.26:example 17 - creating new revision
$ git add langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
$ git commit -m "A tiny change" 
[detached HEAD 0d57690bf36] A tiny change 
 1 file changed, 1 insertion(+), 1 deletion(-)

At this point, the two revisions I want to merge now look like this (look at the two revisions at the top):

Listing 3.27:example 17 - history of revisions
* 0d57690bf36 A tiny change 
| *   b4a035c194c Merge commit '2978ffb9f9d' into HEAD
| |\ 
| |/ 
|/| 
* | 2978ffb9f9d Merge 
| * 9d3e0870754 Merge 
|/ 
* 4ae52d7dc1e Added tag jdk7-b50 for changeset 7faffd237305

Now we will attempt the merge:

Listing 3.28:example 17 - new merge
$ git checkout b4a035c194c 
Warning: you are leaving 1 commit behind, not connected to 
any of your branches: 
 
  0d57690bf36 A tiny change 
 
If you want to keep it by creating a new branch, this may be a good time 
to do so with: 
 
 git branch <new-branch-name> 0d57690bf36 
 
HEAD is now at b4a035c194c Merge commit '2978ffb9f9d' into HEAD
$ git merge 0d57690bf36 
Auto-merging langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
CONFLICT (content): Merge conflict in langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
Automatic merge failed; fix conflicts and then commit the result.

Well, we got a conflict. And if you look inside, it will be again a full-file conflict:

Listing 3.29:example 17 - conflicted file... again
     1  <<<<<<< HEAD 
     2  <?xml version='1.0' encoding='utf-8'?>
     3 
     4  <!-- 
     5   Copyright 2003-2009 Sun Microsystems, Inc.  All Rights Reserved. 
     6   DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 



   205      </SerializedForm> 
   206  </Doclet> 
   207  ||||||| 2978ffb9f9d 
   208  <?xml version='1.0' encoding='utf-8'?>
   209 
   210  <!-- 
   211   Copyright 2003-2009 Sun Microsystems, Inc.  All Rights Reserved. 
   212   DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 



   411      </SerializedForm> 
   412  </Doclet> 
   413  ======= 
   414  <?xml version='1.0' encoding='utf-8'?>
   415 
   416  <!-- 
   417   Copyright 2003-2020 Sun Microsystems, Inc.  All Rights Reserved. 
   418   DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 



   617      </SerializedForm> 
   618  </Doclet> 
   619  >>>>>>> 0d57690bf36

And I kid you not. As long as there are different EOL formats between the branches involved in the merge (... or rebase, ... or cherry-pick, ... or revert), you will get these totally useless full-file conflicts... every... single... time.
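A quick way to spot the risk before merging is to compare the EOL format of the versions involved. The sketch below is just that, a sketch: the function names are invented, the sample contents are in-memory stand-ins, and a mismatch only flags a risk (if one side never touched the file, git can still merge it cleanly).

```python
import subprocess


def eol_format(data: bytes) -> str:
    """Name the EOL format of some content: dos, unix, mac, mixed or none."""
    dos = data.count(b"\r\n")
    unix = data.count(b"\n") - dos
    mac = data.count(b"\r") - dos
    used = [name for name, count in (("dos", dos), ("unix", unix), ("mac", mac)) if count]
    if not used:
        return "none"
    return used[0] if len(used) == 1 else "mixed"


def formats_disagree(versions: dict) -> bool:
    """True when the given versions of a file do not all share one EOL format."""
    return len({eol_format(data) for data in versions.values()}) > 1


def git_show(revision: str, path: str) -> bytes:
    """Raw content of path at revision, as printed by `git show REVISION:PATH`."""
    result = subprocess.run(["git", "show", f"{revision}:{path}"],
                            check=True, capture_output=True)
    return result.stdout


# In-memory stand-ins for the three versions that take part in a merge.
versions = {
    "base": b"<!--\r\n-->\r\n",
    "ours": b"<!--\n-->\n",
    "theirs": b"<!--\r\n-->\r\n",
}
print({label: eol_format(data) for label, data in versions.items()})
print("EOL formats disagree:", formats_disagree(versions))
```

Inside a repository, the dictionary could be filled with `git_show("HEAD", path)`, `git_show("MERGE_HEAD", path)` and the same call for the common ancestor, which are the revisions inspected earlier with unix2dos.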

Consider that, in this case, it was a very tiny change that had to be carried over from the other branch. What if the changes were bigger? Multiple pieces? A few fine changes that should get no conflict and then some conflicting ones? Are you willing to sit down to copy stuff all day long? Not the best way to spend your trained-for-tough-merges brain CPU cycles, right? Yeah, I thought so.

Given that, in our example, the EOL-format change took place in our branch, should we change the EOL format back to the original format before attempting to merge? That might work. Let’s see. First, let’s start over.

3.2.2 Example 17 - again... with a twist

From OpenJDK’s JDK repo, checkout revision 9d3e0870754. Open the file and change the EOL format to Windows. Then commit. Then merge 2978ffb9f9d.

Listing 3.30:example 17 - trying merge again
$ git checkout 9d3e0870754 
HEAD is now at 9d3e0870754 Merge 
$ unix2dos langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
unix2dos: converting file langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml to DOS format... 
$ git add langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
$ git commit -m "Changing EOL-format" 
[detached HEAD d5bb2068164] Changing EOL-format 
 1 file changed, 205 insertions(+), 205 deletions(-) 
$ git merge 2978ffb9f9d 
Auto-merging langtools/src/share/classes/com/sun/tools/javac/util/RawDiagnosticFormatter.java 
Auto-merging langtools/src/share/classes/com/sun/tools/javac/util/BasicDiagnosticFormatter.java 
Auto-merging langtools/src/share/classes/com/sun/tools/javac/util/AbstractDiagnosticFormatter.java 
Auto-merging langtools/src/share/classes/com/sun/tools/javac/resources/compiler.properties 



 langtools/test/tools/javac/processing/model/testgetallmembers/Main.java                               | 2 +- 
 langtools/test/tools/javadoc/6176978/T6176978.java                                                    | 2 +- 
 langtools/test/tools/javadoc/6176978/X.java                                                           | 2 +- 
 83 files changed, 83 insertions(+), 83 deletions(-) 
$

And this time merge went fine. We are so good. It’s not specified in that console output, but the merge revision is cddf9316c72. So this is what we should have done in the first place, right? Get both branches involved to have the same EOL-format as the last common ancestor and then merge. Well, yes, it does work. You can merge (even better, we got no conflict this time!). But, as an additional twist, let’s checkout the original revision we started working from, 9d3e0870754, let’s modify some line from the file, a harmless change, let’s commit, let’s come back to this new merge revision we just created and let’s try to merge again, shall we? What do you think will happen?

Listing 3.31:example 17 - checkout 9d3e0870754
$ git checkout 9d3e0870754 
HEAD is now at 9d3e0870754 Merge

Listing 3.32:example 17 - modified file on top of 9d3e0870754
<?xml version='1.0' encoding='utf-8'?>


<!--
 Copyright 2003 Sun Microsystems, Inc.  All Rights Reserved.
 DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.

I added a second empty line before the XML comment.

Listing 3.33:example 17 - wrapping up revision
$ git add langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
$ git commit -m "Adding empty line" 
[detached HEAD 91df2b0b521] Adding empty line 
 1 file changed, 1 insertion(+)

What does the history of the branches look like now?

Listing 3.34:example 17 - current history
* 91df2b0b521 Adding empty line 
| *   cddf9316c72 Merge commit '2978ffb9f9d' into HEAD
| |\ 
| | *   2978ffb9f9d Merge 
| | |\ 
| | * | 56fcf6c0524 6814575: Update copyright year 
| * | | d5bb2068164 Changing EOL-format 
|/ / / 
* | |   9d3e0870754 Merge

And now, let’s checkout cddf9316c72 and try to merge the revision we just created.

Listing 3.35:example 17 - merging the new revision
$ git checkout cddf9316c72 
Warning: you are leaving 1 commit behind, not connected to 
any of your branches: 
 
  91df2b0b521 Adding empty line 
 
If you want to keep it by creating a new branch, this may be a good time 
to do so with: 
 
 git branch <new-branch-name> 91df2b0b521 
 
HEAD is now at cddf9316c72 Merge commit '2978ffb9f9d' into HEAD
$ git merge 91df2b0b521 
Auto-merging langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
CONFLICT (content): Merge conflict in langtools/src/share/classes/com/sun/tools/doclets/internal/toolkit/resources/doclet.xml 
Automatic merge failed; fix conflicts and then commit the result.

And I bet by now you have a good idea of how big that conflict is, don’t you? If you don’t, here’s a tip. It starts with full and ends with -file conflict. And you can see how by merging a branch that was started before we changed back the EOL-format to what it was originally, we got a mess just as big.

Then how do we get out of this mess? The best approach would be to rewrite history so that the EOL format change never happens. I know, I know... it’s not recommended as a general principle, but there are situations where it’s worth it. I would say this is one of those situations. In the recipes section I provide a rather simple way to rewrite the history of a branch, provided that it’s straight (in other words, it has no merges), starting from the point where the files still had the correct EOL format.

If rewriting history is not an option, then just set the EOL format back to the original one on the branch where it was changed and deal with the consequences. Unfortunately, you are dealing with a situation that shouldn’t have happened in the first place. If a developer changed the EOL format of a file in a PR, it should never have been accepted. It should have been rejected, and the developer should have corrected the history of the branch so that the EOL format change never happens (the branch can be corrected rather effortlessly using the recipe, ok?).
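Setting a file back to its original EOL format boils down to a byte-level replacement. The following is a minimal sketch of that step only; it is not the script from the recipe, and the name `convert_eol` is made up. Unlike unix2dos, it treats any of the three markers as a line break.

```python
def convert_eol(data: bytes, eol: bytes) -> bytes:
    """Rewrite every line break in data (CR LF, CR or LF) as the given marker."""
    # First bring everything to a single LF, then apply the wanted marker.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", eol)


changed = b"<!--\n Copyright 2003\n-->\n"    # saved with *NIX line breaks
restored = convert_eol(changed, b"\r\n")      # back to Windows line breaks
print(restored)
```

On a real file, reading and writing the content as bytes (for example with `pathlib.Path.read_bytes()` and `write_bytes()`) keeps Python from applying any newline translation of its own.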

3.2.3 Exercises

Exercise 8 - using the script

From the exercises repo, checkout branch exercise8/branchA and merge exercise8/branchB. We should get a full-file conflict in primes.py.

Try to do this merge correcting history using the script from this recipe. Solution is here.

3.2.4 Tips

More content is coming soon.

Copyright 2020 Edmundo Carmona Antoranz