3666: rbtools thresholds to warn and/or prevent uploading huge diff

nan****@gmai***** (Google Code)
What version are you running?
rbtools 0.6.x
reviewboard 1.7.6


What's the URL of the page this enhancement relates to, if any?


Describe the enhancement and the motivation for it.
On a repository with a very large SLOC count, a diff can weigh anywhere from 0 up to the combined size of all files, or more if private files are included in the review.
Such a diff can be a mistake (so it can be deleted completely) or a deliberate wish to review a repo from scratch.
Uploading a huge diff has an impact on the database, at least as seen on MySQL. The diffviewer_filediffdata table grows quickly, and there is a Django query that selects the binary_hash and binary columns of every row in this table, so the server needs at least RAM + swap >= size of the diffviewer_filediffdata table + whatever the HTTP server needs ...

MySQL query log
                   63 Query     SELECT COUNT(*) FROM `scmtools_repository` LEFT OUTER JOIN `site_localsite` ON (`scmtools_repository`.`local_site_id` = `site_localsite`.`id`) WHERE `site_localsite`.`id` IS NULL

                   63 Query     SELECT `diffviewer_diffsethistory`.`id`, `diffviewer_diffsethistory`.`name`, `diffviewer_diffsethistory`.`timestamp`, `diffviewer_diffsethistory`.`last_diff_updated` FROM `diffviewer_diffsethistory` WHERE `diffviewer_diffsethistory`.`id` = 9690

                   63 Query     SELECT COUNT(*) FROM `diffviewer_diffset` WHERE `diffviewer_diffset`.`history_id` = 9690

                   63 Query     SELECT `scmtools_repository`.`id`, `scmtools_repository`.`name`, `scmtools_repository`.`path`, `scmtools_repository`.`mirror_path`, `scmtools_repository`.`raw_file_url`, `scmtools_repository`.`username`, `scmtools_repository`.`password`, `scmtools_repository`.`extra_data`, `scmtools_repository`.`tool_id`, `scmtools_repository`.`hosting_account_id`, `scmtools_repository`.`bug_tracker`, `scmtools_repository`.`encoding`, `scmtools_repository`.`visible`, `scmtools_repository`.`local_site_id`, `scmtools_repository`.`public` FROM `scmtools_repository`

                   63 Query     SELECT `diffviewer_filediffdata`.`binary_hash`, `diffviewer_filediffdata`.`binary` FROM `diffviewer_filediffdata`

What operating system are you using? What browser?
any

Please provide any additional information below.
Is there an easy way to find out which Django command leads to the SELECT of every row from diffviewer_filediffdata?
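
As a rough starting point, the MySQL query log itself can be scanned for SELECTs that have no WHERE or LIMIT clause on a given table. This does not identify the Django code path by itself. It only collects the queries issued earlier on the same connection id, which may hint at the request involved. The function name and the regular expression below are illustrative assumptions based on the log format shown above, not part of Review Board or RBTools.

```python
import re

# Matches lines such as "   63 Query     SELECT ..." from the log above.
# The exact layout is an assumption based on that excerpt.
LOG_LINE = re.compile(r"^\s*(\d+)\s+Query\s+(.*)$")


def find_unbounded_selects(log_lines, table):
    """Return (connection_id, earlier_queries) for each SELECT on `table`
    that has neither a WHERE nor a LIMIT clause."""
    history = {}  # connection id -> queries seen so far on that connection
    hits = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        conn_id = int(match.group(1))
        sql = match.group(2).strip()
        lowered = sql.lower()
        unbounded = (
            lowered.startswith("select")
            and "from `{}`".format(table.lower()) in lowered
            and " where " not in lowered
            and " limit " not in lowered
        )
        if unbounded:
            # Copy the history so later appends do not alter this hit.
            hits.append((conn_id, list(history.get(conn_id, []))))
        history.setdefault(conn_id, []).append(sql)
    return hits
```

Calling `find_unbounded_selects(open("mysql.log"), "diffviewer_filediffdata")` (with a hypothetical log path) would list each offending connection together with the queries that preceded the full-table SELECT.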

Do you know whether the same behavior is still present in Review Board 2.0.x?

To prevent uploading huge diffs to the server, I propose defining two thresholds, customizable in .reviewboardrc and/or in the repository configuration on the server side. For example:
WARN_DIFF_SIZE/LINE => warn the user that the diff exceeds the maximum size recommended for optimal reviews
ERROR_DIFF_SIZE/LINE => tell the user that the diff exceeds the maximum size allowed for upload
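
A minimal sketch of how such a client-side check might look. WARN_DIFF_SIZE and ERROR_DIFF_SIZE are the names proposed above, not existing RBTools settings, and the default values and the `classify_diff` function are purely illustrative. A line-count variant would work the same way with `diff.count(b"\n")` in place of `len(diff)`.

```python
# Hypothetical settings, named after the proposal above. These are NOT
# existing RBTools options; the values are arbitrary examples.
WARN_DIFF_SIZE = 1 * 1024 * 1024    # bytes
ERROR_DIFF_SIZE = 10 * 1024 * 1024  # bytes


def classify_diff(diff, warn_size=WARN_DIFF_SIZE, error_size=ERROR_DIFF_SIZE):
    """Return "ok", "warn" or "error" for a diff given as bytes.

    A threshold set to None is treated as disabled.
    """
    size = len(diff)
    # Check the hard limit first so it wins over the warning threshold.
    if error_size is not None and size > error_size:
        return "error"
    if warn_size is not None and size > warn_size:
        return "warn"
    return "ok"
```

A client could then prompt for confirmation on "warn" and refuse to upload on "error", before any data is sent to the server.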
#1 nan****@gmai***** (Google Code)
Replying to myself: on the server side, there is already the diffviewer_max_diff_size setting, so I think we must define it now.
A warning threshold could still be useful, though.
#2 david
  • -reviewboard
    +rbtools
#3 david
  • +BetterErrors
#4 david
  • -New
    +Fixed