4002: Perforce diff submission is broken in 2.0.18

a.tinsmith

What version are you running?

ReviewBoard 2.0.18 and RBTools 0.7.4
ReviewBoard was installed from CentoOS 7 epel repository.

What's the URL of the page containing the problem?

http://<company_internal_host>/r/new/

What steps will reproduce the problem?

  1. p4 diff -Od -du //repo/path/...@123 //repo/path/...@345 > my_ticket.diff
  2. Start new review request via web interface.
  3. Upload my_ticket.diff

What is the expected output? What do you see instead?

Expected: New review request created.

Actual: Failes with error: ValueError: need more than 1 value to unpack

Contents of the log file:

2015-10-26 22:46:49,298 - ERROR - - Unexpected error when validating diff.
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/reviewboard/webapi/resources/validate_diff.py", line 154, in create
save=False)
File "/usr/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 156, in create_from_upload
save=save)
File "/usr/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 180, in create_from_data
check_existence=(not parent_diff_file_contents)))
File "/usr/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 301, in _process_files
copied=f.copied)
File "/usr/lib/python2.7/site-packages/reviewboard/scmtools/perforce.py", line 333, in parse_diff_revision
filename, revision = revision_str.rsplit('#', 1)
ValueError: need more than 1 value to unpack


My custom debug log, which added to `PerforceTool.parse_diff_revision' method in the perforce.py shows this:

2015-10-26 22:46:49,298 - DEBUG - - PerforceTool.parse_diff_revision: file_str: "//repo/path/file.cpp"; revision_str: "2015-10-13 06:15:57.000000000 -0700"

What operating system are you using? What browser?

Windows 7 64-bit, Chrome 46

Please provide any additional information below.

Perforce client:

$ p4 -V
Perforce - The Fast Software Configuration Management System.
Copyright 1995-2015 Perforce Software. All rights reserved.
This product includes software developed by the OpenSSL Project
for use in the OpenSSL Toolkit (http://www.openssl.org/)
See 'p4 help legal' for full OpenSSL license information
Version of OpenSSL Libraries: OpenSSL 1.0.1p 9 Jul 2015
Rev. P4/LINUX26X86_64/2015.1/1240625 (2015/09/29).


Perforce server:

$ p4 info
...
Server version: P4D/SOLARIS10X86_64/2015.1/1204891 (2015/07/17)
...


The bug is similar to this one: https://hellosplat.com/s/beanbag/tickets/3887/ (Perforce diff submission is broken in 2.0.17)

Perforce generates valid unified diff format. (See description and examples here: http://www.gnu.org/software/diffutils/manual/html_node/Unified-Format.html)
ReviewBoard is unable to digest it. For some reason ReviewBoard assumes that file paths in the diff will contains Perforce revisions. This assumption is incorrect. Diff headers contain file paths (in repo format) and timestamp, as required by unified diff spec.

I've made a quick hack just to make ReviewBoard work. I'm not familiar with ReviewBoard code, so most likely this hack is not good for production. In any case here is the diff:


--- /usr/lib/python2.7/site-packages/reviewboard/scmtools/perforce.py.orig 2015-10-27 10:17:56.582356699 +1100
+++ /usr/lib/python2.7/site-packages/reviewboard/scmtools/perforce.py.1 2015-10-27 10:42:44.901743380 +1100
@@ -214,7 +214,7 @@
elif revision == HEAD:
depot_path = path
else:
- depot_path = '%s#%s' % (path, revision)
+ depot_path = '%s%s' % (path, revision)

     res = self.p4.run_print('-q', depot_path)
     if res:

@@ -329,7 +329,18 @@
def parse_diff_revision(self, file_str, revision_str, args, *kwargs):
# Perforce has this lovely idiosyncracy that diffs show revision #1
# both for pre-creation and when there's an actual revision.
- filename, revision = revision_str.rsplit('#', 1)
+ if '#' in revision_str:
+ filename, revision = revision_str.rsplit('#', 1)
+ revision = '#' + revision
+ else:
+ m = re.search('(?P<Y>\d{4})-(?P<M>\d{2})-(?P<D>\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\.\d{9}', revision_str)
+ if m:
+ filename = file_str
+ revision = '@%s/%s/%s:%s' % (m.group('Y'), m.group('M'), m.group('D'), m.group('time'))
+ revision_str = file_str + revision
+ else:
+ raise ValueError('Failed to parse file revision from supplied diff.')
+
if len(self.client.get_files_at_revision(revision_str)) == 0:
revision = PRE_CREATION
return filename, revision


As you can see, I parse the timestamp and convert it into P4 datespec. Then I use this datespec as a revision to retrieve the original file.

Another issue: PerforceDiffParser class is broken. This is the regex it uses to parse Perforce diffs:

SPECIAL_REGEX = re.compile("^==== ([^#]+)#(\d+) ==([AMD]|MV)== (.*) ====$")

This is incorrect. Perforce diff header is documented here: https://www.perforce.com/perforce/doc.current/manuals/cmdref/p4_diff2.html


==== file1 (filetype1) - file2 (filetype2) ==== summary

Here are some examples I obtained from our repo:

deleted files
==== //repo/path/file.cpp#2 - <none> ===
==== //repo/path/file.h#1 - <none> ===

changed
==== //repo/path/file.cpp#38 (text) - //repo/path/file.cpp#39 (text) ==== content

new file
==== <none> - //repo/path/file.py#1 ====

moved
==== <none> - //repo/path_new/file_moved.txt#1 ====
==== //repo/path_old/file.txt#1 - <none> ===

Your regex will never be able to recognize them. Please fix it.

This fix is important for our organization, since it's the blocking issue for us to adopt ReviewBoard.

#1 a.tinsmith
  • +
    --- /usr/lib/python2.7/site-packages/reviewboard/scmtools/perforce.py.orig	2015-10-27 10:17:56.582356699 +1100
    +++ /usr/lib/python2.7/site-packages/reviewboard/scmtools/perforce.py.1	2015-10-27 10:42:44.901743380 +1100
    @@ -214,7 +214,7 @@
             elif revision == HEAD:
                 depot_path = path
             else:
    -            depot_path = '%s#%s' % (path, revision)
    +            depot_path = '%s%s' % (path, revision)
     
             res = self.p4.run_print('-q', depot_path)
             if res:
    @@ -329,7 +329,18 @@
         def parse_diff_revision(self, file_str, revision_str, *args, **kwargs):
             # Perforce has this lovely idiosyncracy that diffs show revision #1
             # both for pre-creation and when there's an actual revision.
    -        filename, revision = revision_str.rsplit('#', 1)
    +        if '#' in revision_str:
    +            filename, revision = revision_str.rsplit('#', 1)
    +            revision = '#' + revision
    +        else:
    +            m = re.search('(?P<Y>\\d{4})-(?P<M>\\d{2})-
chipx86
#2 chipx86

Let me provide some background on this.

Historically, p4 diffs didn't contain the information needed to properly look up the files from the repository. There was a lack of revision information and metadata to help with handling moved/renamed/copied files. p4 diff also simply didn't generate diffs in certain situations, especially for older versions. (I believe this is still the case today.)

One of Review Board's first users had a workaround for this, using a custom diffing script, and a custom header format was created that encoded the information needed for this. This was adopted in Review Board as part of RBTools.

Perforce, it seems, has eventually added a header format, though the information in the documentation indicates that it's still nowhere near as expressive as the format we're using, meaning we wouldn't be able to properly show certain operations.

For Perforce, RBTools usage is a requirement. Ideally, you should never post a diff through the web UI anyway. Using RBTools, you can simply create or update a review request by simply typing:

$ rbt post <changenum>

That creates a review request with the diff from the changeset, and will set the summary/description based on your change description. Far easier than going through the web UI.

You can download RBTools for Windows, Mac, and Linux.

chipx86
#3 chipx86

Also, I'm open to taking a patch that supports Perforce's standard diff format, but you're probably better off using RBTools. However, if you wanted to make a project out of this, patches need to be posted at https://reviews.reviewboard.org/, need to continue to support the existing format, need to be modify the PerforceDiffParser (which is where the logic must be for that parsing), and needs a set of new unit tests for all possible conditions in the header.

#4 a.tinsmith

Christian, thank you for the elaborate response. You're right, we have to use RBTools with Perforce, there is no simple alternative.

Most of our changes span over several changelists, that's why we need to use revision ranges rather than single revision number. Curren;ty I'm trying to intergrate RBTools with P4V client, so it's easier to create reviews. Most people in my team use P4V exclusively and don't like the idea that they have to setup command line tools+environemt to post reviews. I wish RB could pick up Perforce changelists directly from the web UI as it does for Git. Anyway, thanks for the hard work and making this project going.

chipx86
#5 chipx86

RBTools can handle the revision ranges, which should help, using standard Perforce syntax. However, not being tied to a particular changeset, it won't be able to auto-find the matching review request without specifying the -u argument or -r <review_request_id> along with the range.

We know some people have set up P4V to use RBTools, though we don't have any official guide for it. However, it should be possible to do (and if anyone on your team goes that route, I'd love some steps/documentation on how you put it together!).

I seem to recall there's some reason we didn't make the New Review Request page show commits for Perforce, but I'll find out. Even with that, though, it'd only be able to post changes for review that already exist in the repository. We generally recommend posting changes for review prior to landing in the repository, to ensure others don't build upon defective code (as it's harder to replace code with new changes from a new review when others have already started depending on those changes).

#6 a.tinsmith

Yes, we perform code reviews before we commit changes to the common trunk. Developers use other branches, where commits are allowed and changes naturally can grow into several commits.

I'll try to make it work with P4V and if it works well will post the guide.

david
#7 david
  • -New
    +NotABug