Skip to content

Conversation

@kuba--
Copy link
Contributor

@kuba-- kuba-- commented Sep 10, 2018

This PR closes #440

Changes:

  • Updated go-git
  • repository interface requires Cache() cache.Object
  • one cache object per RepositoryPool

kuba-- added 2 commits September 10, 2018 10:10
Signed-off-by: kuba-- <[email protected]>
@kuba-- kuba-- added proposal proposal for new additions or changes performance Performance improvements labels Sep 10, 2018
@kuba-- kuba-- requested review from a team and jfontan September 10, 2018 11:25
Password string `short:"P" long:"password" default:"" description:"Password used for connection."`
PilosaURL string `long:"pilosa" default:"http://localhost:10101" description:"URL to your pilosa server." env:"PILOSA_ENDPOINT"`
IndexDir string `short:"i" long:"index" default:"/var/lib/gitbase/index" description:"Directory where the gitbase indexes information will be persisted." env:"GITBASE_INDEX_DIR"`
CacheSize cache.FileSize `long:"cache" default:"536870912" description:"Object cache size" env:"GITBASE_CACHE_SIZE"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you change the default to something more human-friendly? like 100 and allows only MB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah sure, how about 512MB?

func NewRepositoryPool(maxCacheSize cache.FileSize) *RepositoryPool {
return &RepositoryPool{
repositories: make(map[string]repository),
cache: cache.NewObjectLRU(maxCacheSize),
Copy link
Contributor

@jfontan jfontan Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly we are using the same cache for all repositories. While this may be OK now we may have problems when partitions are used:

  • I'm not really sure of the cache implementation thread safety
  • Recent objects from one repo may be evicted by others and make it run slower
  • Concurrent use of the same cache may be slower (locking?)

I would have one cache per repo, a pool of caches that can be evicted or one cache per partition. I think we can have one per repo for now and get back to it when partitions are in use.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem of having one cache per repo is that we will not able to know the amount of memory gitbase will use. Maybe we can use as a default value 96 MiB * number of repositories. Or apart from that, maybe we can implement an LRU with two key layers, to evict keys from a specific repository. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe we can keep one cache in a pool but add mutex and prefix keys by repo id?

// AddGitWithID adds a git repository to the pool. ID should be specified.
func (p *RepositoryPool) AddGitWithID(id, path string) error {
return p.Add(gitRepo(id, path))
return p.Add(gitRepo(id, path, p.cache))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't store the cache in the git/sivaRepo. If they are different instances (one per repo) it can use lots of memory.

@kuba--
Copy link
Contributor Author

kuba-- commented Sep 14, 2018

@jfontan - rebased. Lets go with the simplest approach, so far.

Copy link
Contributor

@jfontan jfontan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go with this approach and iterate over it. I agree.

@ajnavarro ajnavarro merged commit 1bb9a66 into src-d:master Sep 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance improvements proposal proposal for new additions or changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Increase go-git cache size

4 participants