Skip to content

Conversation

@ksy36
Copy link
Collaborator

@ksy36 ksy36 commented Aug 9, 2023

Checklist for reviewer:

  • Commits should reference a bug or github issue, if relevant (if a bug is
    referenced, the pull request should include the bug number in the title)
  • Scan the PR and verify that no changes (particularly to
    .circleci/config.yml) will cause environment variables (particularly
    credentials) to be exposed in test logs
  • Ensure the container image will be using permissions granted to
    telemetry-airflow
    responsibly.

@ksy36 ksy36 changed the title Fixes #135 - Add a job to fetch Webcompat Knowldegebase related bugs … Fixes #135 - Add a job to fetch Webcompat Knowldegebase related bugs and store them in BQ Aug 9, 2023
@ksy36 ksy36 changed the title Fixes #135 - Add a job to fetch Webcompat Knowldegebase related bugs and store them in BQ Fixes #135 - Add a job to fetch Webcompat Knowldege Base related bugs and store them in BQ Aug 9, 2023
@scholtzan scholtzan self-requested a review August 9, 2023 19:09
@ksy36
Copy link
Collaborator Author

ksy36 commented Aug 9, 2023

Tagging @denschub @jgraham as well r?

Copy link
Contributor

@scholtzan scholtzan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. someone should verify that the bugzilla import logic is doing what it's supposed to though.

CI is currently not running since the PR is coming from a fork. Once someone has reviewed the logic I can trigger the CI manually to get it to pass.

After this is merged and a docker image gets pushed, you can schedule running the docker container in telemetry-airflow. There are already some examples which you could more or less just copy-paste, like this one: https://github.com/mozilla/telemetry-airflow/blob/main/dags/search_term_data_validation.py

@ksy36
Copy link
Collaborator Author

ksy36 commented Aug 10, 2023

Thanks, Anna! I got some feedback during our sync meeting regarding bugzilla logic and will work on updating the code shortly.

@ksy36
Copy link
Collaborator Author

ksy36 commented Aug 10, 2023

The changes we discussed are as follows:

  1. Fetch updates for standard positions, discussions, etc. from core bugs and (if they’re missing in those of a kb bug) store them in relation tables as url -> kb bug id.
  2. Store priority and severity as integers (and change “normal” etc. strings to equivalent of 1,2,3). If it’s not set or N/A, keep it empty.
  3. Create 2 columns for status and resolution (instead of 1 state) and save the actual values of bug status and resolution from bugzilla.
  4. Get urls from user_story field from breakage bugs and store it in the url_patterns relation table as url -> breakage bug (+ rename the field from knowledge_base_bug to just bug)

Copy link
Member

@denschub denschub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! And the data that's already there also looks perfect. We do probably need to iterate a bit over time, but this is a really really good start.

RUN groupadd --gid ${USER_ID} ${GROUP_ID} && \
useradd --create-home --uid ${USER_ID} --gid ${GROUP_ID} --home-dir ${HOME} ${GROUP_ID}

WORKDIR ${HOME}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a minor nit, but it's generally a good idea to reduce the number of layers for container images. While it doesn't matter much here as the images don't get pushed to a registry, it's probably not a bad idea to take care of that.

You can reduce the images quite a bit by changing the user first, then copy'ing all the files while doing a chown at the same time, and then combining all the RUNs together:

WORKDIR ${HOME}
USER ${USER_ID}

COPY --chown=${USER_ID}:${GROUP_ID} . .

RUN pip install --upgrade pip && \
    pip install -r requirements.txt && \
    pip install .

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't modified this file since it was generated by create_job script based on a template that every job in this repository follows, so it's probably fine to leave it as is.

…d bugs from bugzilla and store them in BQ
@ksy36
Copy link
Collaborator Author

ksy36 commented Aug 17, 2023

Thanks for the review @denschub!

CI is currently not running since the PR is coming from a fork. Once someone has reviewed the logic I can trigger the CI manually to get it to pass.

@scholtzan would you be able to trigger CI, please?

@scholtzan
Copy link
Contributor

Looks like flake isn't quite happy, yet.

@ksy36
Copy link
Collaborator Author

ksy36 commented Aug 17, 2023

Oh thanks. I've just pushed a change that sets a version of flake8 in requirements.txt (same issue as in #68), so it should pass now.

@scholtzan scholtzan merged commit e11acc9 into mozilla:main Aug 17, 2023
@ksy36 ksy36 deleted the issues/135/1 branch August 18, 2023 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants