3843: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 70: ordinal not in range(128)

mark*****@gmai***** (Google Code)
david
What version are you running?
rbt 0.7.2, I also tested with RBTools 0.8 alpha 0 (dev), i.e. 1ded9b55b3aa2c9bb66f11dd269d4ecba095b9f9 from April 7, 2015

What's the URL of the page containing the problem?
Our local review board server...

What steps will reproduce the problem?
I run "rbt post -d" and pass the "--username" and "--server" options.

What is the expected output? What do you see instead?
I expect rbt post to succeed :)

What operating system are you using? What browser?
Distributor ID:	Ubuntu
Description:	Ubuntu 12.04.4 LTS
Release:	12.04
Codename:	precise

Linux ubuntu 3.2.0-72-generic #107-Ubuntu SMP Thu Nov 6 14:24:01 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Browser not relevant

Please provide any additional information below.
When I use rbt post to try to post a review I get the error below:
~~~
Traceback (most recent call last):
  File "/usr/local/bin/rbt", line 9, in <module>
    load_entry_point('RBTools==0.8alpha0.dev', 'console_scripts', 'rbt')()
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/commands/main.py", line 133, in main
    command.run_from_argv([RB_MAIN, command_name] + args)
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/commands/__init__.py", line 555, in run_from_argv
    exit_code = self.main(*args) or 0
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/commands/post.py", line 633, in main
    extra_args=extra_args)
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/clients/git.py", line 449, in diff
    exclude_patterns)
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/clients/git.py", line 559, in make_diff
    return self.make_svn_diff(merge_base, diff_lines)
  File "/usr/local/lib/python2.7/dist-packages/RBTools-0.8alpha0.dev-py2.7.egg/rbtools/clients/git.py", line 621, in make_svn_diff
    diff_data += line
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 70: ordinal not in range(128)
~~~

I can't post the full patch of my review, but could anyone comment on whether this is a known issue? Also, any idea how I can isolate and debug which specific part of the patch is causing it?

Thanks!
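
One way to narrow down which part of a patch triggers this, as a minimal sketch: save the diff to a file, read it as raw bytes, and scan for anything outside the ASCII range. The helper name `find_non_ascii` and the sample data are hypothetical and not part of RBTools.

```python
def find_non_ascii(diff_bytes):
    """Return (line_number, column, byte_value) for each non-ASCII byte."""
    hits = []
    for line_no, line in enumerate(diff_bytes.splitlines(), start=1):
        # bytearray yields integers on both Python 2 and Python 3.
        for col, byte in enumerate(bytearray(line)):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


# Hypothetical sample: the second line contains an en dash (U+2013),
# which UTF-8 encodes as the bytes e2 80 93.
sample = b"+plain ascii line\n+range 1\xe2\x80\x932\n"
for line_no, col, byte in find_non_ascii(sample):
    print("line %d, column %d: byte 0x%02x" % (line_no, col, byte))
```

For a real patch, the input would come from something like `open(path, 'rb').read()`, where `path` is wherever the diff was saved. A leading 0xe2 byte, as in the error above, is typical of UTF-8 punctuation such as dashes and curly quotes.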
#1 seide******@gmai***** (Google Code)
I'm seeing this as well.  My configuration is

RB 1.7.28 and RB 2.0.15
CentOS6
Perforce 2014

This error does not occur when I post the same diff using RBTools 0.5.2.  The file causing the problem is a plain text file (my .gnus file, actually).  If you need more info or a traceback, let me know.

--steve

#2 seide******@gmai***** (Google Code)
More data points

RB 1.7.28
Python 2.6.6

RBTools 0.5.2  : good
RBTools 0.6.3  : good
RBTools 0.7.0  : bad
RBTools 0.7.1  : bad
RBTools 0.7.2  : bad

If you need help reproducing it, I could probably send you the 2 revisions of my .gnus file that are causing the problem for me.  

--Steve
#3 david
  • +PendingReview
  • +Component-RBTools
  • +david
#4 david
Fixed in release-0.7.x (219abf7). This will ship in 0.7.3. Thanks!
  • -PendingReview
    +Fixed
#6 paff*****@gmai***** (Google Code)
Hello, I have the same problem, not from git.py but from svn.py:

>>> Running: svn diff --non-interactive --diff-cmd=diff --notice-ancestry http://xxxxxxxxxx/@211 http://xxxxxxxxx/@212 --username xxxxxxxx --password xxxxxxxxx
Traceback (most recent call last):
  File "/usr/bin/rbt", line 9, in <module>
    load_entry_point('RBTools==0.7.3alpha0.dev', 'console_scripts', 'rbt')()
  File "/usr/lib/python2.6/site-packages/RBTools-0.7.3alpha0.dev-py2.6.egg/rbtools/commands/main.py", line 133, in main
    command.run_from_argv([RB_MAIN, command_name] + args)
  File "/usr/lib/python2.6/site-packages/RBTools-0.7.3alpha0.dev-py2.6.egg/rbtools/commands/__init__.py", line 580, in run_from_argv
    exit_code = self.main(*args) or 0
  File "/usr/lib/python2.6/site-packages/RBTools-0.7.3alpha0.dev-py2.6.egg/rbtools/commands/post.py", line 640, in main
    extra_args=extra_args)
  File "/usr/lib/python2.6/site-packages/RBTools-0.7.3alpha0.dev-py2.6.egg/rbtools/clients/svn.py", line 369, in diff
    'diff': b''.join(diff),

Maybe more files (svn.py for example) should be adapted to the new diff processing you talked about in your fix? Thanks!
#7 jonatha*******@gmai***** (Google Code)
I updated to 0.7.3 and am still seeing this problem. Same line (diff_data += line), same error.

The problem occurs on UTF-16LE files, but I know it worked properly in earlier releases.
#8 jonatha*******@gmai***** (Google Code)
I also tried to work around this by passing --exclude for the UTF-16 files, but then I get this error:

>>> Running: git -c core.quotepath=false diff-tree --no-color --no-prefix -r -u --no-ext-diff aa277cf885bf1c4c121270b94219c72e529a93b4..167bc60c8e1e128c821d032957404fe24b74773b
Traceback (most recent call last):
  File "/usr/bin/rbt", line 9, in <module>
    load_entry_point('RBTools==0.7.3', 'console_scripts', 'rbt')()
  File "/usr/lib/python2.7/site-packages/rbtools/commands/main.py", line 133, in main
    command.run_from_argv([RB_MAIN, command_name] + args)
  File "/usr/lib/python2.7/site-packages/rbtools/commands/__init__.py", line 612, in run_from_argv
    exit_code = self.main(*args) or 0
  File "/usr/lib/python2.7/site-packages/rbtools/commands/diff.py", line 68, in main
    extra_args=extra_args)
  File "/usr/lib/python2.7/site-packages/rbtools/clients/git.py", line 464, in diff
    exclude_patterns)
  File "/usr/lib/python2.7/site-packages/rbtools/clients/git.py", line 538, in make_diff
    log_output_on_error=False)
  File "/usr/lib/python2.7/site-packages/rbtools/utils/process.py", line 187, in execute
    data = post_process_output(data)
  File "/usr/lib/python2.7/site-packages/rbtools/utils/process.py", line 99, in post_process_output
    return [line.decode('utf-8') for line in output]
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 1: invalid start byte
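
The 0xff at position 1 is consistent with a diff line whose first character is `+` or `-` followed by a UTF-16LE byte order mark (ff fe), which is never valid UTF-8. A rough sketch of the failure and of one tolerant alternative follows; `decode_line` is a hypothetical helper, not the RBTools implementation, and replacing undecodable bytes is only an assumption about what is acceptable for display.

```python
import codecs


def decode_line(raw):
    """Decode a line of command output without raising on non-UTF-8 bytes."""
    try:
        return raw.decode('utf-8')
    except UnicodeDecodeError:
        # Assumption: lossy replacement is fine for log/display purposes.
        # Diff content that must round-trip should be kept as bytes instead.
        return raw.decode('utf-8', 'replace')


# Hypothetical diff line: '+' followed by UTF-16LE content with a BOM.
sample = b'+' + codecs.BOM_UTF16_LE + u'hello'.encode('utf-16-le')

try:
    sample.decode('utf-8')
except UnicodeDecodeError as exc:
    print("strict decode failed at byte offset %d" % exc.start)

print(repr(decode_line(sample)))
```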
#9 david
Jonathan, can you include the full traceback for the diff_data += line error?
  • -Fixed
    +New
#10 jonatha*******@gmai***** (Google Code)
Sorry, I keep forgetting to check while I'm at work. I know it was the same line in git.py:make_svn_diff, so I'm pretty sure the traceback would be exactly the same.
#11 david
  • -New
    +Fixed
  • -david
    +david
#12 subodhk

With RBTools 0.7.5 running against Perforce, we are still getting this issue:

Generating diff for pending changeset 550981
CRITICAL: 'ascii' codec can't decode byte 0xe2 in position 25: ordinal not in range(128)

Python versions: 2.7.3 through 2.7.11
Ubuntu and FC4

#13 david

Please run with --debug and post the traceback here.

#14 subodhk

Found the issue. The change descriptions in our DB are mixed: some are ASCII and some are UTF-8. We have a customization that modifies the change description and commits it to Perforce.

Our code had a bug where it handled the description as a plain string. That worked when the description was ASCII, since Python's default encoding is ASCII, but when the description was UTF-8 we were not decoding it as UTF-8. Because we manipulate the string (regex search and replace, split), it was throwing errors.

We fixed this wherever we found it, and we now have a UTF-8 encoded changelist description that we want to write back to Perforce.

For that we use amend_commit_description. The problem with this function is that it passes the text along as-is and does not encode it:

p.communicate(input_string)

Preparing to amend change --rbtools-pending-cln:549348
Running: p4 change -o 549348
Traceback (most recent call last):
  File "/tmp/rbtool-for-mac/bin/rbt", line 24, in <module>
    load_entry_point(version_string, 'console_scripts', 'rbt')()
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/commands/main.py", line 133, in main
    command.run_from_argv([RB_MAIN, command_name] + args)
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/commands/__init__.py", line 622, in run_from_argv
    exit_code = self.main(*args) or 0
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/commands/post.py", line 941, in main
    base_dir=base_dir)
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/commands/post.py", line 537, in post_request
    self.add_review_info(changenum, review_request.absolute_url.replace('.com', '.com'))
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/commands/post.py", line 657, in add_review_info
    self.tool.amend_commit_description(change_dict['Description'], self.revisions)
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/clients/perforce.py", line 1517, in amend_commit_description
    self.p4.modify_change(new_change)
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/clients/perforce.py", line 53, in modify_change
    return self.run_p4(['change', '-i'], input_string=new_change_spec)
  File "/tmp/rbtool-for-mac/lib/RBTools/RBTools-0.7.5-py2.7.egg/rbtools/clients/perforce.py", line 160, in run_p4
    p.communicate(input_string) # Send input, wait, set returncode
  File "/usr/lib/python2.7/subprocess.py", line 740, in communicate
    self.stdin.write(input)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 1671: ordinal not in range(128)

To fix this error, I encoded the input string before sending it:
p.communicate(input_string.encode('utf-8'))
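
A slightly more defensive variant of the same idea, as a sketch only: encode at the pipe boundary, and only when the value is text, so that input which is already bytes is not encoded a second time. `to_bytes` and `run_with_input` are hypothetical helpers, not functions from RBTools.

```python
import subprocess


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it only if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def run_with_input(argv, input_text):
    """Run argv, feed it input_text as UTF-8 bytes, and return its stdout."""
    p = subprocess.Popen(argv, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
    out, _ = p.communicate(to_bytes(input_text))
    return out


# U+2013 (en dash) is the character from the traceback above.
print(repr(to_bytes(u'fix \u2013 done')))
```

With these helpers, the call in the traceback would look roughly like `run_with_input(['p4', 'change', '-i'], new_change_spec)`. Note that U+2013 encodes to the bytes e2 80 93 in UTF-8, which matches the 0xe2 byte reported in the original error.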