Loading...

XML

Word

Printable

Type: Enhancement
Resolution: Fixed
Priority: P3
Fix Version/s: 1.0
Affects Version/s: None
Component/s: bots
Labels:
None

One of the biggest culprits for bot performance, at least for bots interacting with GitLab, is that we aren't currently trying to just look at MRs that have changed since last we checked. We are instead always fetching all open MRs and always spawning WorkItems for all of them. This certainly doesn't scale well.

The reason we do this can be found in a comment here regarding the "updated_at" field in MRs:
https://github.com/openjdk/skara/blob/c7fc108a271580fa7e6afecf290d5ba0fafeba05/forge/src/main/java/org/openjdk/skara/forge/PullRequestUpdateCache.java#L36

"GitLab CE does not update this field on events such as adding an award"

I started to investigate this, but my findings aren't matching what this comment claims. According to comments in this gitlab issue https://gitlab.com/gitlab-org/gitlab/-/issues/21312, it's actually worse. It basically says that "updated_at" fields are capped to update at most once per minute. In my own testing, it also seems like updates that never touched the timestamp will never be reflected in the timestamp. This makes it pretty complex to reliably query for only updated MRs. It's still possible though, with some workarounds, and I would like to implement something that solves this.

For Github, it seems like we can trust the updated_at field. This means a simple algorithm like this should work fine: First round, query for all PRs. Save the highest updated_at timestamp found in the result for next round. After that, query for all PRs with a higher updated_at than the one found in the previous round.

For GitLab, we have two complications. The first is that the PR could have had an update up to one minute after the current updated_at value. To mitigate this, we can simply change the query to get all PRs with updated_at higher than 1 minute before the highest one found in the previous round. This will guarantee that we aren't missing any PRs. This may be good enough, and would still be a big improvement, but it would be even better if we could avoid basically guaranteed false positives. To avoid false positives, we can save the PR objects from the last round and compare them to the new query result. We only need to consider the PR updated if there is a difference.

The second complication is that not all PR state that we care about is in the MR query result. Typically reviews (approvals) and comments (notes) need to be queried separately. This means that to be able to dismiss a PR as a false positive, we also need to fetch the recently updated reviews and comments and compare them for differences. At least with ~~SKARA-1555~~, we don't need to also fetch award emojis.

We could potentially reduce the need for false positives checks by keeping track of a bot local timestamp in parallel with each last updated_at from the query results. The local timestamp, together with a boolean flag, could then be used to track when enough time has passed so that we don't need to check for false positives.

I intend to implement this logic in a common class in the forge module, so that all bots that need to subscribe to PR updates can just simply use this with a single call like 'getUpdatedPullRequests()'. By always querying for updated PRs, there will no longer be a need to periodically check closed PRs instead of just OPEN. Some level of customization will be available as some bots need to consider closed PRs and others don't.

blocks

SKARA-1590 Convert PullRequestBot to use new poller

Resolved

SKARA-1594 Convert rest of bots polling PRs and Issues to use new pollers

Resolved

SKARA-1572 Improve TestInfoBot scheduling

Closed

is blocked by

SKARA-1589 Improve TestIssue and TestPullRequest

Resolved

relates to

SKARA-1595 Introduce an IssuePoller similar to the PullRequestPoller

Resolved

SKARA-1555 Remove support for review by award emojis in GitLab

Resolved

links to

Commit openjdk/skara/ca8d09a2

Review openjdk/skara/1369

(1 relates to, 2 links to)

Assignee:: Erik Joelsson

Reporter:: Erik Joelsson

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Created:: 2022-08-25 12:22

Updated:: 2022-09-20 16:40

Resolved:: 2022-09-20 16:40

Details

Description

Attachments

Issue Links

Activity

People

Dates