4638: Binary patch will be corrupted

Misery

What version of Djblets are you using?

djblets 1.0.2
RB 3.0.2

Which module(s) have the problem?

Download diff (http://localhost/r/38/diff/raw/)

What steps will reproduce the problem?

  1. Go to http://localhost/r/new/
  2. Select a repo (used Mercurial repo here)
  3. Select attached diff and upload manually

Same happens with rbtools!

What is the expected output? What do you see instead?

After download the binary diff is broken and cannot be applied.

What version of Python and Django?

1.6.11.6 and python 2.7.14

Please provide any additional information below.

The git diff binary literal will be splitted. In attached diff the literal is correctly placed. After upload and download a part of the literal will be always top and the rest is somewhere in the diff.

Initial:

diff --git a/A b/A
new file mode 100644
--- /dev/null
+++ b/A
@@ -0,0 +1,3 @@
+sdf
+sfd
+
diff --git a/bla.gif b/bla.gif
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..3288d1035d70bb86517e2c233f1a904e41f06b29
GIT binary patch
literal 3208
zc$~$TX;4#H9>pJdFE7h`I{03&0|5<4L}(j=LYh^D009EB2nYfqF)E0PvIqo$u!IC;
...
uD}@id>)yD*7MNAu68;M0-aIg_eY_mcH^scvB+q+?5whc40$l(7mG}qF=&jWN

diff --git a/bla.java b/bla.java
new file mode 100644
--- /dev/null
+++ b/bla.java
@@ -0,0 +1,7 @@
+class Main
+{
+ static public void a()
+ {
+
+ }
+}

Broken:

literal 3208
zc$~$TX;4#H9>pJdFE7h`I{03&0|5<4L}(j=LYh^D009EB2nYfqF)E0PvIqo$u!IC;
...
uD}@id>)yD*7MNAu68;M0-aIg_eY_mcH^scvB+q+?5whc40$l(7mG}qF=&jWN

diff --git a/bla.java b/bla.java
new file mode 100644
--- /dev/null
+++ b/bla.java
@@ -0,0 +1,7 @@
+class Main
+{
+ static public void a()
+ {
+
+ }
+}
diff --git a/bla.gif b/bla.gif
new file mode 100644
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..3288d1035d70bb86517e2c233f1a904e41f06b29
GIT binary patch
diff --git a/A b/A
new file mode 100644
--- /dev/null
+++ b/A
@@ -0,0 +1,3 @@
+sdf
+sfd
+

Misery
#1 Misery
  • +
    diff --git a/A b/A
    new file mode 100644
    --- /dev/null
    +++ b/A
    @@ -0,0 +1,3 @@
    +sdf
    +sfd
    +
    diff --git a/bla.gif b/bla.gif
    new file mode 100644
    index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..3288d1035d70bb86517e2c233f1a904e41f06b29
    GIT binary patch
    literal 3208
    zc$~$TX;4#H9>pJdFE7h`I{03&0|5<4L}(j=LYh^D009EB2nYfqF)E0PvIqo$u!IC;
    z4Pgx^2_%MSi-;n)HH!#pV`001_UHg@yWrlIJ$6smIN;Pw%?CbA)l~J|kGJlvTfg%^
    z=Tu#upO2GsJQH97?*ZV`r%wO?T)ld=sHmu?r>Ci@iNoPUL_|zZPCkD8_~ONj%a$#J
    zAPD%^GXI2Z^bXs^^$7M}W6K5&=C}TC!cEx`pSDBD%t%a2V8-u`&)kz7FSJeEw=)?q
    z{}rN8=T5Djzdipb06jGv^-GM`%#J-3@FkWlKB!S!D8WM!c6v2Qvu4wb4cpPq?%H1U
    zesfXHz;|q8f}yMBFq>=8Y)kO7G5_6{yUJt3l4Jv@(0r|PnB)1FTTa%xMbg%71Fz4V
    z7Sc!=nLPI>h`Ngg6=wJfip>tf9`15BB_gIdM&?Q(5W#j$^HEogUg@hVRUbdm(rwnj
    z?m3&rs`gDSIh+#Z=9_1mj}MoetD{_&3%<p-m<UFKymz@k4SBq%SlfR02OfV33TaL)
    zege$K2!;}DonN}A?)s2-WP92B9W!RwqzzViuXrc$wHL4u`|Lb>RQv4Mu_zx$T8mHB
    z9d#%?qzDtxdFTBl7y4hG2++3(r%m3jltk^#WIU@-453C4GX|y6HZ&Mx;@JVCCMFp=
    z1=-VoUm051CC7a+s)K=j$eAAwMZm@^f(SZDgYs7FW=p9Rv^{0KUD@nhu*5CjjZze?
    zj
Misery
#2 Misery

Same happens if repository is using "git".

chipx86
#3 chipx86

Binary patches aren't supported. In Git, they default to not being included unless --binary is passed or an equivalent option is in .gitconfig (which we should turn off). For the new Git diff support in Mercurial, we will need to determine how to turn this off as well. And in Review Board, we should not break when encountering a binary diff.

So there's three tasks here, which we'll split up and track internally:

  1. git diff should be passed some flag to turn off binary files. There doesn't seem to be an equivalent to --binary, though, so we might have to do something a bit more complex.
  2. We should figure out if Mercurial is always generating diffs for binary files, and see if that behavior can be turned off. Doesn't look like it can, at first glance, but perhaps?
  3. Review Board's Git diff parser should be able to handle the payload for binary files.
Misery
#4 Misery

Well, I don't see the reason why rbtools should be modified to cut those binary literals. Every other WebAPI-Access or manually generated diff could have an embedded binary. That should be handled by reviewboard only. We use rbtools to generate the diff and use it for some additional stuff. So it would strip off any binary here. By the way... git-diff-style on Mercurial was possible before my change with a change in user-hgrc. But that required explicit user action.

I understand that reviewboard could disable the usage of binaries because it is really complicated. But basic support for binaries would help us for our CI system as we cannot check any diffs with a binary on jenkins [1] since reviewboard provides broken diffs for binaries - stripped diff is the same problem. Provide the latest binary would be enough. No interdiff or binary patching is required. Why not have an option that must be explicitly enabled to store binaries without further processing?

[1] https://github.com/vmware/jenkins-reviewbot

chipx86
#5 chipx86

Valid points. The reason I had thought to disable it on the RBTools side as well for posting is because Review Board doesn't make use of it, and it would save on storage, bandwidth, and posting times. I think we'll leave it at fixing the Review Board side to just not blow up (it didn't used to, but clearly it does now).

By the way, we're planning to ship official Jenkins support down the road, which will be a bit easier to configure and manage. The new version will support the new multi-config integrations capabilities in Review Board 3.0, along with the features we added for automated code review (such as status updates).

Misery
#6 Misery

Thanks!
You can ping me if you have any state of your jenkins extension. I will give it a try here and see if it fits to our CI. :-)