4728: Git svn : diff can't be generated because of badly formatted call to git rev-list

KG
david
david

What version are you running?

RBTools 1.0.1, first seen with 1.0
Git : 1.8.3.1

Using Git-svn. Branch master is synchronized with SVN repository, no commits on it. Working on branch.

What steps will reproduce the problem?

  1. Call rbt post without options on the developing branch

What is the expected output? What do you see instead?

Exception raised, no diff nor review created.
Should have a review created with correct diff.

What operating system are you using?

CentOS 7.5

Attach the debug out from the command.

bash-4.2$ rbt post --debug

RBTools 1.0.1
Python 2.7.5 (default, Jul 13 2018, 13:06:57)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
Running on Linux-3.10.0-862.9.1.el7.x86_64-x86_64-with-centos-7.5.1804-Core
Home = myhome
Current directory = {my_directory}/trunk
Command line: rbt post --debug
Running: tf vc help
Checking for a Subversion repository...
Running: svn --non-interactive info
Command exited with rc 1: [u'svn', u'--non-interactive', u'info']
Checking for a Git repository...
Running: git rev-parse --git-dir
Running: git config core.bare
Running: git rev-parse --show-toplevel
Running: git symbolic-ref -q HEAD
Running: git svn info
Repository info: Path: {correct}, Base path: /trunk, Supports changesets: False
Checking for a Mercurial repository...
Running: hg root
Command exited with rc 255: [u'hg', u'root']
abort: no repository found in '{my_directory}/trunk' (.hg not found)!


Checking for a CVS repository...
Unable to execute "cvs": skipping CVS
Checking for a Perforce repository...
Unable to execute "p4 help": skipping Perforce
Checking for a Plastic repository...
Unable to execute "cm version": skipping Plastic
Checking for a ClearCase repository...
Unable to execute "cleartool help": skipping ClearCase
Checking for a Bazaar repository...
Unable to execute "bzr help": skipping Bazaar
Checking for a Team Foundation Server repository...
Unable to execute "tf help": skipping TFS
Running: git config --get reviewboard.url
Making HTTP GET request to {reviewboard_server}/api/
Making HTTP GET request to {reviewboard_server}/api/repositories/?tool=Subversion
Cached response for HTTP GET {reviewboard_server}/api/repositories/?tool=Subversion expired and was modified
Running: git rev-parse refs/heads/dev
Running: git svn rebase -n
Running: git rev-parse refs/remotes/trunk
Running: git rev-list SHA1_lastcommit_on_master --not --remotes=origin
Running: git rev-parse SHA2_first_commit_on_remote^
Command exited with rc 128: [u'git', u'rev-parse', u'SHA2_first_commit_on_remote^']
fatal: ambiguous argument 'SHA2_first_commit_on_remote^': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
SHA2_first_commit_on_remote^


Traceback (most recent call last):
File "/usr/bin/rbt", line 9, in <module>
load_entry_point('RBTools==1.0.1', 'console_scripts', 'rbt')()
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/commands/main.py", line 120, in main
command.run_from_argv([RB_MAIN, command_name] + args)
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/commands/init.py", line 719, in run_from_argv
exit_code = self.main(args) or 0
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/commands/post.py", line 793, in main
self.revisions = get_revisions(self.tool, self.cmd_args)
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/utils/review_request.py", line 67, in get_revisions
revisions = tool.parse_revision_spec(cmd_args)
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/clients/git.py", line 144, in parse_revision_spec
parent_ref, 'origin')
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/clients/git.py", line 568, in _rev_list_youngest_remote_ancestor
youngest_remote_commit = self._rev_parse('%s^' % local_commit)[0]
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/clients/git.py", line 536, in _rev_parse
revisions = self._execute([self.git, 'rev-parse'] + revisions)
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/clients/git.py", line 1214, in _execute
return execute(cmdline, cwd=self._git_toplevel,
args, **kwargs)
File "/usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/utils/process.py", line 185, in execute
raise Exception('Failed to execute command: %s' % command)
Exception: Failed to execute command: [u'git', u'rev-parse', u'SHA2_first_commit_on_remote^']

Please provide any additional information below.

git rev-list SHA1_lastcommit_on_master --not --remotes=origin
seems to be wrong : options specified after SHA1 are not taken into account

Fixed by patching /usr/lib/python2.7/site-packages/RBTools-1.0.1-py2.7.egg/rbtools/clients/git.py
to reorder correctly the command line, to obtain
git rev-list --not --remotes=origin SHA1_lastcommit_on_master
=> no more problems

chipx86
#1 chipx86

Thanks! Definitely had this wrong. We'll get a fix into 1.0.2.

  • -New
    +Confirmed
  • +SCM:Git
  • +david
chipx86
#2 chipx86
  • +Release-1.0.x
david
#3 david

This seems to work ok using more modern versions of git. I suspect we could change it to be git rev-list --not --remotes=origin --not SHA1_lastcommit_on_master (note the second --not) but I'm not sure that that will work with Git 1.8. Can you verify?

  • -Confirmed
    +NeedInfo
#4 KG

modifying git.py to have "git rev-list --not --remotes=origin --not SHA1_lastcommit_on_master" called doesn't work. Same backtrace as before.
Calling it directly in a terminal gives me the list of all commits of the repository.

#5 rfs613

I encountered this problem also, and have two additional data points to add:

1) If one does a fresh "git svn clone" operation, followed by "rbt post" in the resulting repository, the git rev-parse SHA2_first_commit_on_remote^ problem does not occurr. It only fails in an older repo created by an older git-svn. (Aside: why would it be asking for a commit older than the first commit?)

2) However there is a subsequent failure, which occurs for both rbt post and rbt diff. The issue is that git svn find-rev returns a string, rather than bytes. This causes make_svn_diff to abort:

>>> RBTools 1.0.1                                                                                                                                                                                                                                                                           
>>> Python 3.6.6 (default, Jul 19 2018, 14:25:17)                                                                                                                                                                                                                                           
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)]
>>> Running on Linux-4.18.9-200.fc28.x86_64-x86_64-with-fedora-28-Twenty_Eight
[...snip...]
>>> Running: git rev-parse refs/heads/master
>>> Running: git rev-parse origin/trunk
>>> Running: git rev-list 25928235d653cdfb768073a943b044cd828d7203 --not --remotes=origin
>>> Running: git status --porcelain --untracked-files=no --ignore-submodules=dirty
>>> Running: git version
>>> Running: git -c core.quotepath=false diff --no-color --no-prefix -r -u --no-ext-diff 25928235d653cdfb768073a943b044cd828d7203..cd4f09a13b2c0c9b95e1bd5baa1e5ec1e2de134e
>>> Running: git svn find-rev 25928235d653cdfb768073a943b044cd828d7203
Traceback (most recent call last):
  File "/usr/bin/rbt", line 11, in <module>
    load_entry_point('RBTools==1.0.1', 'console_scripts', 'rbt')()
  File "/usr/lib/python3.6/site-packages/rbtools/commands/main.py", line 120, in main
    command.run_from_argv([RB_MAIN, command_name] + args)
  File "/usr/lib/python3.6/site-packages/rbtools/commands/__init__.py", line 719, in run_from_argv
    exit_code = self.main(*args) or 0
  File "/usr/lib/python3.6/site-packages/rbtools/commands/diff.py", line 70, in main
    extra_args=extra_args)
  File "/usr/lib/python3.6/site-packages/rbtools/clients/git.py", line 634, in diff
    exclude_patterns)
  File "/usr/lib/python3.6/site-packages/rbtools/clients/git.py", line 778, in make_diff
    return self.make_svn_diff(merge_base, diff_lines)
  File "/usr/lib/python3.6/site-packages/rbtools/clients/git.py", line 834, in make_svn_diff
    diff_data += b'--- %s\t(revision %s)\n' % (original_file, rev)
TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'

I don't know if it is the right place to fix it, but the following change at the beginning of make_svn_diff solves the problem:

        Returns:
            bytes:
            The reformatted diff contents.
        """
        rev = self._execute([self.git, 'svn', 'find-rev', merge_base]).strip()
+       rev = rev.encode()  #todo: make it safe for python2.7 also

        if not rev:
            return None

With the above change, I am able to successfully rbt post from a git-svn repo.

#6 splatter2

I believe the real problem here is the fact that the remote is hardcoded to origin. i.e. _rev_list_youngest_remote_ancestor() is always called with origin as the third parameter.
This only works with git > 2.0 and when using a standard layout for svn. Prior to git 2.0, git-svn did not use a standard prefix (see git svn man page or https://git-scm.com/docs/git-svn#git-svn---prefixltprefixgt). Even with git 2.x, depending on the layout of the svn repository and how the clone was created, a prefix of origin might not exist.

The order of git rev-list SHA1_lastcommit_on_master --not --remotes=origin is a red herring. There is nothing wrong with the current order. Reversing the order as @KG suggested can sometimes fix the problem, but it only works for a different reason, not the way that the _rev_list_youngest_remote_ancestor() function was intended.

The real fix would be to make the name of the remote configurable so that it can work in non-standard situations. A better approach would be to automatically detect the name of the remote svn prefix, but I don't think that is trivial with git-svn. Although the documentation for _rev_list_youngest_remote_ancestor() states that the remote parameter is configurable, that is not true of the current implementation.

A workaround for those who have already checked out using git-svn 1.x and do not have a standard prefix of origin is to manually change the ref/remotes/ prefix to origin.

The issue the @rfs613 mentioned about the TypeError is still a valid issue, but it only affect python 3. I don't encounter this issue on python 2.7

#7 splatter2

Looking at this a little more, I checked the code of the old version that worked (0.7.11) and noticed that it was figuring out the name of the remote ref for git-svn by parsing the output of git svn rebase -n (see here: https://github.com/reviewboard/rbtools/blob/release-0.7.11/rbtools/clients/git.py#L267). I think that would be be the best way to do this instead of hardcoding origin.

Also note that the --remotes= argument to git rev-list will not work in all cases. Since git rev-parse automatically turns the argument into a wildcard prefix match by appending /* to the argument of --remotes=, it will only work of the argument given contains directories or branches beneath it. In some case, the remote is a single ref file which will cause it to not act as intended. Note that if the ref given for --remotes=prefix does not exist git will not issue an error, it will just ignore that parameter. I think the proper way to do this for git-svn would be to determine the upstream ref like 0.7.11 did, and just use that as a --not SHA1 argument instead of the --remotes= to cover all cases.

#8 rfs613

I can confirm the workaround in comment #6 by @splatter2 works in an old repository. Thanks for that! Seems we are one step closer to solving this.

I did also notice that svn rebase -n is in fact being called. Perhaps the old logic from 0.7.11 is still there after all (maybe just being used differently)?

>>> RBTools 1.0.1          
>>> Python 3.7.1 (default, Oct 23 2018, 18:19:07)
[GCC 8.2.1 20180801 (Red Hat 8.2.1-2)]                                                   
>>> Running on Linux-4.18.16-300.fc29.x86_64-x86_64-with-fedora-29-Twenty_Nine
[...snip...]
>>> Making HTTP GET request to http://reviewboard.local/
>>> Running: git rev-parse refs/heads/master
>>> Running: git svn rebase -n
>>> Running: git rev-parse refs/remotes/trunk
>>> Running: git rev-list cb43270923352b3dc090410113c060961179215a --not --remotes=origin
>>> Running: git status --porcelain --untracked-files=no --ignore-submodules=dirty
>>> Running: git version
>>> Running: git -c core.quotepath=false diff --no-color --no-prefix -r -u --no-ext-diff cb43270923352b3dc090410113c060961179215a..436b0827d5acdbe2b03eab1081dd5881b54cf88f
>>> Running: git svn find-rev cb43270923352b3dc090410113c060961179215a
[...]
#9 splatter2

@rfs613, You are correct, git svn rebase -n is still being called. The logic is now encapsulated in rbtools/clients/_get_parent_branch(). I had not noticed that. However, that value is not being over here. I think it would be pretty easy to fix this issue with the following patch to rbtools/clients/git.py

@@ -141,7 +141,7 @@ class GitClient(SCMClient):
             parent_ref = self._rev_parse(self._get_parent_branch())[0]

             merge_base = self._rev_list_youngest_remote_ancestor(
-                parent_ref, 'origin')
+                parent_ref, self._get_parent_branch())

             result = {
                 'base': parent_ref,
@@ -208,7 +208,7 @@ class GitClient(SCMClient):
                     'Unexpected result while parsing revision spec')

             parent_base = self._rev_list_youngest_remote_ancestor(
-                result['base'], 'origin')
+                result['base'], self._get_parent_branch())
             if parent_base != result['base']:
                 result['parent_base'] = parent_base
         else:
@@ -556,8 +556,7 @@ class GitClient(SCMClient):
             parent links).
         """
         local_commits = self._execute(
-            [self.git, 'rev-list', local_branch, '--not',
-             '--remotes=%s' % remote])
+            [self.git, 'rev-list', local_branch, '--not', remote])
         local_commits = local_commits.split()

         if local_commits == []:

I have not tested this extensively, but it appears to work in simple cases of git-svn and git repositories.