4786: Cannot post diff with German Umlauts


What version are you running?


What's the URL of the page containing the problem?


What steps will reproduce the problem?

  1. On Windows, create a patch that touches a file whose name contains a German umlaut (e.g. "ü").
  2. Post the patch to Review Board.
  3. The upload fails and Review Board logs the stack trace below.

What is the expected output? What do you see instead?

2019-01-30 14:35:59,780 - ERROR - None - hook - /api/review-requests/25065/draft/diffs/ - root - Error uploading new diff: 'utf8' codec can't decode byte 0xfc in position 56: invalid start byte
Traceback (most recent call last):
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/webapi/resources/diff.py", line 294, in create
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/reviews/forms.py", line 150, in create
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/diffviewer/forms.py", line 69, in create
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 509, in create_from_upload
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 614, in create_from_data
    check_existence=check_existence and not parent_diff_file_contents))
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/diffviewer/managers.py", line 745, in _process_files
    for f in parser.parse():
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/scmtools/hg.py", line 246, in parse
    return super(HgGitDiffParser, self).parse()
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/scmtools/git.py", line 267, in parse
    next_i, file_info, new_diff = self._parse_diff(i)
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/scmtools/git.py", line 311, in _parse_diff
    line, info = self._parse_git_diff(linenum)
  File "/opt/reviewboard/dist/lib/python2.7/site-packages/reviewboard/scmtools/git.py", line 469, in _parse_git_diff
    file_info.origFile = file_info.origFile.decode('utf-8')
  File "/opt/reviewboard/dist/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 56: invalid start byte

Please provide any additional information below.

One of the filenames in the changeset contains a German umlaut, and the same filename is also referenced inside an XML file. The changeset renames the file and updates the XML reference, replacing "ü" with "ue".
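For context, byte 0xfc is how "ü" is encoded in Latin-1 and Windows-1252, and it is not a valid start byte in UTF-8. That would be consistent with the diff carrying the filename in a Windows code page, though this has not been confirmed. The sketch below reproduces the same kind of decode error outside Review Board and shows one possible fallback. The filename and the `decode_filename` helper are hypothetical and are not part of Review Board's code or its actual fix.

```python
# Hypothetical filename for illustration; 0xfc is "ü" in Latin-1 / Windows-1252.
raw_filename = b"docs/\xfcbersicht.xml"


def decode_filename(data, encodings=("utf-8", "cp1252")):
    """Return the first successful decode of `data` among `encodings`.

    Hypothetical helper, not a Review Board API.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort: substitute U+FFFD for bytes that cannot be decoded.
    return data.decode("utf-8", errors="replace")


# A strict UTF-8 decode fails, as in the traceback above.
try:
    raw_filename.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc)

# The fallback recovers the name when the bytes are really Windows-1252.
print(decode_filename(raw_filename))
```

Guessing the encoding like this is a heuristic. Sending the diff as UTF-8 from the client side avoids the ambiguity altogether.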