1668: Diffs not applying cleanly circa 1.0.7 with odd line endings
- WontFix
- Review Board
AaronJ*******@gmai***** (Google Code) (Is this you? Claim this profile.) | |
|
|
July 18, 2011 |
What version are you running? 1.0.7 What's the URL of the page containing the problem? N/A What steps will reproduce the problem? 1. Create a file with mixed DOS and Unix line endings 2. post-review 3. view diffs What is the expected output? What do you see instead? The diff page produces an error such as "Error: The patch to '...' didn't apply cleanly." What operating system are you using? What browser? Linux. Any browser. Please provide any additional information below. In one case, I found that a user had a file containing line endings that were solely carriage returns. In one I found the file contained mixed CR/LF and LF. While I realize that this is not ideal, at least providing the user with some feedback on what went wrong would help them solve their own problems (e.g. changing the line endings or marking the file as binary).
We have a lot of logic in place to handle mixed line endings. I'm sad to see that something is breaking. I need an example diff that demonstrates this problem, preferably one that applies against the Review Board repository so that we can test it.
-
+ NeedInfo
I'll see what I can do tonight. I didn't have anything clean that I could hand off outside of the company when I filed this, but I think I should be able to create something trivially enough. One file that triggered it simply had: a b^M ^M c d So reproducing it shouldn't take long.
Ah, OK, here's the easiest way to reproduce. This worked on a file we had checked in: Open an *existing* file (my post above was wrong, new files seem not to trigger this). Modify an existing line, adding ^M at the end of the line (which places it before the NL). Now join the following two lines (deleting the newline) and insert a ^M between them. I'm going to upload a pair of files to which I've done this. This should trigger the bug on your side when you check in the first file and then try to publish a review using the second file as the edit.
-
+ +
Folks were able to post the review using UI after using this workaround: $<our p4 wrapper> diff --strip-trailing-cr > your.html $ upload your html to <RB server>/r/new/
One thing I'm not sure from the repro case here. Is this the source file (pre-patch) that has the funky newlines? Or does the patch add them?
Nevermind, I have a test case now. I just don't have a fix that's working. The problem is that we do newline normalization so that files with lines ending in \r, \n, \r\n, or \r\r\n are processed correctly, but this completely breaks with the patch files. If you have: foo bar^M foo^Mbaz foo And then you add a couple lines, diff will give you: foo +hello bar^M foo^Mbaz foo +world See how diff considers the ^M to be part of the string, not a newline. However, our parser will see that ^M and try to make it a newline, in order to work around other file formats and oddities. This then breaks us, since we'll get a normalized diff that looks like: foo +hello bar foo baz foo +world We may have to process diffs separately, but it's going to be hard to do so without causing new problems. I'd be really interested in finding out how these files are getting so messed up.
-
- NeedInfo + Confirmed
Of course, if we then change the diff so that the ^M is preserved, it'll break against the source input where the ^M is not preserved, and there's no great way to preserve that there. If we do change the ^M to "\n " in order to get a new line in the diff, the lines referenced in the diff will no longer be correct and it may fail to apply. Rock and a hard place.
There's, of course, no question that this is outside of what one would typically expect, but as for "how these files are getting so messed up," imagine that you're reviewing a change to ReviewBoard itself which addresses this problem. It might well include a test case with such a file and subsequent changes to the test case... well, you see where this is going. It is, in fact, purely in a QA context that we're seeing these issues (though that might be partially a result of the versioning differences we have between some parts of our QA environment and other environments using ReviewBoard.
Okay. Well, as it currently stands, I don't know that we can do anything about this without severely breaking other things. As I look into it and think about it, I'm less and less optimistic. Will keep trying for a while though.
Let me ask.. Is the intent of the ^M in these files to be newlines, or are they just ending up this way from broken editors? Are they intended to be there for the sake of "I want a ^M" ?
I believe that, yes, these are intentional in some cases. I also believe that they're unintentional in others. If post-review choked and said "I just can't handle these, mark them as binary or fix the line-endings" that would not be a great solution, but it would at least let users recover. Currently, the way it works, the review ends up with an error on the diffs page after it's been uploaded (and very often, after it's been published as users rarely check the diffs to see if they showed up correctly). This leads to two problems: 1) Reviews that focus on the diff failure rather than the changes 2) User confusion because the error doesn't indicate what went wrong
Okay, so what I can do, potentially, is convert only \r\r\n and \r\n to \n. However, your case above where the file is mostly \n characters, but with a couple extra \r characters immediately preceding the \n characters will still break. Those will be seen as \r\n and converted to \n, which will still break. So, pick your poison, I guess. :/
Is there any way to identify the problem at post-review time and let the user decide what to do about it?
I know your question is now a year old, but I missed it. Sorry :/ Basically, yes, you could in theory have post-review ask, but there's no guarantee of getting it right. This is such a tricky problem and every fix I've tried has caused new problems. I spent a great deal of time on this, and researched how other tools solved it. To summarize: They haven't. At least not when I was working on this last. There were bug reports and complaints for other tools, all involving this same sort of issue, with no great solutions. I think what we need to do is just provide a better error response for broken diffs. Allow the user to see the diff file itself and download it. It may not be obvious that it's the line endings, but it's better than "The patch doesn't apply." As for this bug, I don't think there's anything we can do about it. People should be careful about line endings :)
-
- Confirmed + WontFix