Bug 564569 - Staging View performance issue updating UI with thousands of changed files and hundreds of projects
Summary: Staging View performance issue updating UI with thousands of changed files an...
Status: NEW
Alias: None
Product: EGit
Classification: Technology
Component: UI (show other bugs)
Version: 5.8   Edit
Hardware: All All
: P3 normal (vote)
Target Milestone: ---   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-23 04:37 EDT by Michael Haubenwallner CLA
Modified: 2021-02-06 14:26 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Haubenwallner CLA 2020-06-23 04:37:37 EDT
Split out in reply to Thomas Wolf from bug 535681 comment #5:

(In reply to Michael Haubenwallner from bug 535681 comment #4)
> Created attachment 283378 [details]
> JProfiler screenshot while Staging View is updating tree view with 9000 items
> 
> Similar issue here:
> Staging View does contribute more than 20 seconds to the Eclipse UI lock,
> when the input for the Staging View is getting updated (but not necessarily
> changed), for example by saving some additional change to some java file.
> 
> Here, the git repository in question does consist of ~67000 files
> distributed to ~630 projects, with ~9000 files currently having Unstaged
> Changes.
> 
> It turns out that the Staging View does initialize text and image to each of
> the ~9000 items into the TreeView, even if most of them never become
> actually visible.
> In total, the most expensive part here is in StagingEntry.getFile(), because
> for each of the ~9000 files, FileSystemResourceManager.resourceForLocation()
> does iterate over all ~630 projects to find some best matching project.
> 
> Although I'm pretty new to Java and Eclipse, to me it should be possible to
> reduce the tree update time in general by using "virtual" tree views, and I
> have drafted something to see whether this may help:
> https://git.eclipse.org/r/c/165047
> 
> What do you think?

You've picked a hard problem. Some thoughts:

* The fact that this takes a lot of time is actually unrelated to the issue of
  this bug 535681. (Though of course that it happens even when view is not visible
  isn't nice.)
* Virtual trees may come with their own set of problems and bugs. The
  CommitFileDiffViewer (for instance bottom right in the history view) uses a
  virtual table, and we struggled quite a bit to make it perform for huge
  commits.
* The selection handling when files are staged/unstaged is highly complicated,
  and may mean that all tree nodes get realized anyway.
* To get decoration overhead out of the main thread: can these viewers be made
  to use a lightweight background decorator for all their decorations, or at
  least for the problem decoration?
* (Small optimization) StagingEntry.getProblemSeverity() could return
  Severity.NONE right away if the state is REMOVED, MISSING, or
  MISSING_AND_CHANGED?
* To reduce the number of calls to resourceForLocation() is there some way that
  we could cache project paths as we go, and subsequently look up projects in
  that cache first? If a project location is a prefix of the file location,
  the file should be in that project, and then determining the file via
  IProject.getFile() and checking if it exists may be faster.
* According to your screenshot, the ProblemLabelDecorator accounts for some 30% 
  of the time. Creating the tree items themselves also appears to be expensive,
  plus setting their images and texts (together another 30%). So perhaps a
  virtual tree is the way to go, but don't expect it to be easy.
* Careful testing on all three platforms (Windows, GTK, OS X) will be needed.
Comment 1 Thomas Wolf CLA 2020-06-23 08:24:17 EDT
One can indeed quite easily reduce the number of calls to FileSystemResourceManager.resourceForLocation() by simply creating the hierarchical model in the StagingViewContentProvider also for the flat list view, and then simply asking the containing StagingFolderEntry's IContainer, if it exists, for the file. (That StagingFolderEntry won't be shown, though -- it's a flat list view.) In my tests on OS X and in a Win 10 virtual machine showed indeed a speed improvement for the refresh of some 30%.

Another problem is switching the presentation when a lot of items are selected. This takes quite some time on OS X, and is outright impossible on Windows. Switching from list view to tree view and back with all 9000 files selected causes the refresh on Windows to take forever (at least minutes; I always killed the process). This can be avoided by getting the selection first, clearing it, refreshing, then setting the selection again.

I'll still have to test these changes on Linux, but if good there, I'll push two commits for these two improvements.

That is not to discourage your attempts to use a virtual tree. Even with these improvements, refreshes with 9000 files take too long (about 9-12 seconds in list view, about 2 sec in tree view).
Comment 2 Eclipse Genie CLA 2020-06-23 11:10:05 EDT
New Gerrit change created: https://git.eclipse.org/r/165363
Comment 3 Eclipse Genie CLA 2020-06-23 11:10:18 EDT
New Gerrit change created: https://git.eclipse.org/r/165364
Comment 4 Thomas Wolf CLA 2020-06-23 11:18:40 EDT
(In reply to Thomas Wolf from comment #1)
> Another problem is switching the presentation when a lot of items are
> selected. This takes quite some time on OS X, and is outright impossible on
> Windows. Switching from list view to tree view and back with all 9000 files
> selected causes the refresh on Windows to take forever (at least minutes; I
> always killed the process). This can be avoided by getting the selection
> first, clearing it, refreshing, then setting the selection again.

(In reply to Eclipse Genie from comment #3)
> New Gerrit change created: https://git.eclipse.org/r/165364

Actually, this is bug 519231.
Comment 5 Eclipse Genie CLA 2020-06-24 02:28:10 EDT
Gerrit change https://git.eclipse.org/r/165363 was merged to [master].
Commit: http://git.eclipse.org/c/egit/egit.git/commit/?id=b530c0dad1ed512d9259624498d599e41be1a28c
Comment 6 Michael Haubenwallner CLA 2020-06-24 06:11:09 EDT
(In reply to Thomas Wolf from comment #1)
> One can indeed quite easily reduce the number of calls to
> FileSystemResourceManager.resourceForLocation() 

Actually I'm wondering whether FileSystemResourceManager.resourceForLocation() itself really does have to sequentially iterate over all projects, even if some return value was already found. Feels like a topic for a different Eclipse project though.

> That is not to discourage your attempts to use a virtual tree. Even with
> these improvements, refreshes with 9000 files take too long (about 9-12
> seconds in list view, about 2 sec in tree view).

Never mind: Getting the virtual tree ready may take some time actually, and any improvement available earlier is highly appreciated, thanks!