[Integration][Gitlab-v2] Added support for file kind #1482

Open
wants to merge 211 commits into
base: main
from

shariff-6
Contributor

@shariff-6 shariff-6 commented Mar 11, 2025

Description

What

  • Added support for syncing file data from GitLab to Port within the GitLab integration.
  • Introduced file content fetching and parsing (JSON/YAML) in the GitLab client.
  • Enabled file search across repositories and groups using GitLab’s advanced search syntax with scope and query.

Why

Enable users to sync and manage file-based data from GitLab in Port.

How

  • Client Updates (gitlab_client.py):

    • Added get_file_content for retrieving file content and file_exists for checking file existence.
    • Introduced search_files to search for files using scope (e.g., "blobs") and query, supporting GitLab’s advanced search syntax across repositories and groups.
  • Parsing Logic (utils.py):

    • Added parse_file_content to parse JSON and YAML file content into structured data.
  • Entity Processing:

    • Extended processing in file_entity_processor.py to handle file:// references by fetching and parsing file content.
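
As a rough illustration of the file-fetching step, the sketch below builds the path for GitLab's repository files endpoint and decodes the base64 payload that endpoint returns. The helper names `build_file_path` and `decode_file_payload` are hypothetical and are not taken from this PR; the actual get_file_content implementation may differ.

```python
import base64
from typing import Any
from urllib.parse import quote


def build_file_path(project: str, file_path: str) -> str:
    """Build the REST path for a single repository file.

    Both the project path and the file path must be URL-encoded,
    including the slashes, so safe='' is passed to quote().
    """
    encoded_project = quote(project, safe="")
    encoded_file = quote(file_path, safe="")
    return f"projects/{encoded_project}/repository/files/{encoded_file}"


def decode_file_payload(payload: dict[str, Any]) -> str:
    """Return the text content of a repository files API response.

    The endpoint reports the encoding in the "encoding" field and,
    for base64, the encoded bytes in "content".
    """
    content = payload.get("content", "")
    if payload.get("encoding") == "base64":
        return base64.b64decode(content).decode("utf-8")
    return content
```

The decoded string can then be handed to a parser for JSON or YAML files.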

Integration Updates:

  • GitLabFileProcessor:

    • Added to process file:// references in entity mappings by fetching file content.
    • Also processes search:// references in entity mappings by checking file existence.
  • GitLabFilesResourceConfig (integration.py):

    • Introduced query field to configure file searches using GitLab’s advanced search syntax.
  • Integration and Main Updates (main.py):

    • Added on_resync_files handler to sync files using scope="blobs" and user-defined queries.
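
To make the scope and query parameters concrete, here is a minimal sketch of how a blob search request against GitLab's project search endpoint could be assembled. The function name `build_search_request`, the example host, and the default page size are assumptions for illustration; the PR's search_files may structure this differently.

```python
from typing import Any
from urllib.parse import quote


def build_search_request(
    base_url: str,
    project: str,
    query: str,
    scope: str = "blobs",
    page: int = 1,
    per_page: int = 100,
) -> tuple[str, dict[str, Any]]:
    """Return the URL and query parameters for a project-level search.

    GitLab's search API takes the search scope in `scope` and the
    query string (including advanced syntax such as
    "filename:config.yaml") in `search`.
    """
    url = f"{base_url.rstrip('/')}/api/v4/projects/{quote(project, safe='')}/search"
    params = {"scope": scope, "search": query, "page": page, "per_page": per_page}
    return url, params
```

A caller would pass the returned URL and parameters to its HTTP client and keep incrementing `page` until an empty result list comes back.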

Utilities and Improvements:

  • Utility Functions (utils.py):
    • Included parse_file_content for JSON/YAML parsing.
    • Relies on get_file_paths in gitops/utils.py to extract file paths from entity mappings (if applicable).
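
The parsing behaviour described above could look roughly like the following sketch. It is not the PR's parse_file_content: the name `parse_file_content_sketch`, the extension-based dispatch, and the fallback to raw text on parse errors are all assumptions. YAML parsing uses PyYAML only if it is installed.

```python
import json
from typing import Any

try:
    import yaml  # PyYAML, a third-party package; optional in this sketch
except ImportError:
    yaml = None


def parse_file_content_sketch(content: str, file_path: str) -> Any:
    """Parse JSON or YAML content based on the file extension.

    Returns the raw string when the extension is not recognised,
    when PyYAML is unavailable, or when parsing fails.
    """
    lowered = file_path.lower()
    if lowered.endswith(".json"):
        try:
            return json.loads(content)
        except json.JSONDecodeError:
            return content
    if lowered.endswith((".yaml", ".yml")) and yaml is not None:
        try:
            return yaml.safe_load(content)
        except yaml.YAMLError:
            return content
    return content
```

Falling back to the raw string keeps a single malformed file from aborting a whole resync, at the cost of pushing type checks to the mapping layer.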

Notes for PR Reviewers

  • Feature Focus: This PR enables file syncing with query-based search (e.g., "filename:config.yaml", "test") instead of pattern matching.
  • Scope: Currently uses scope="blobs" for file searches; future enhancements could expand to other scopes.
  • Config: Users must update file kind configs to use query instead of glob patterns for file searches in gitlab-v2.

Type of change

Please leave one option from the following and delete the rest:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • New Integration (non-breaking change which adds a new integration)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Non-breaking change (fix of existing functionality that will not change current behavior)
  • Documentation (added/updated documentation)

All tests should be run against the Port production environment (using a testing org).

Core testing checklist

  • Integration able to create all default resources from scratch
  • Resync finishes successfully
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Scheduled resync able to abort existing resync and start a new one
  • Tested with at least 2 integrations from scratch
  • Tested with Kafka and Polling event listeners
  • Tested deletion of entities that don't pass the selector

Integration testing checklist

  • Integration able to create all default resources from scratch
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Resync finishes successfully
  • If new resource kind is added or updated in the integration, add example raw data, mapping and expected result to the examples folder in the integration directory.
  • If resource kind is updated, run the integration with the example data and check if the expected result is achieved
  • If new resource kind is added or updated, validate that live-events for that resource are working as expected
  • Docs PR link here

Preflight checklist

  • Handled rate limiting
  • Handled pagination
  • Implemented the code in async
  • Support Multi account

Screenshots

PackagesResource

API Documentation

Provide links to the API documentation used for this integration.

- Define `ObjectKind` enum for supported object types
- Implement `create_gitlab_client` function to create GitLab client instance
- Define `on_start` event handler for application setup
- Implement `setup_application` function for webhook setup (placeholder)
- Define `on_resync_projects` event handler for syncing GitLab projects to Port
- Create `GraphQLClient` class to execute GraphQL queries
- Implement `execute_query` method to send queries with variables
- Handle errors and raise exceptions for failed queries
- Create `GitLabClient` class to encapsulate GraphQL and REST clients
- Implement `get_projects` method to fetch projects with pagination using GraphQL
- Create `Fragments` class with reusable `PROJECT_FIELDS` fragment
- Create `ProjectQueries` class with `LIST` query to fetch projects with pagination
- Create `RestClient` class to send REST API requests
- Implement `send_api_request` method for single requests
- Implement `make_paginated_request` method for paginated requests
- Handle errors and raise exceptions for failed requests
- Create `AuthClient` class to manage authentication token
- Implement `get_headers` method to return headers with bearer token
- Update .env.example with new environment variables for event listener, base URL, token mapping, integration identifier, and integration type
- Update blueprints.json with Service blueprint for GitLab projects, including properties for URL, README, description, namespace, default branch, languages, and labels
- Update port-app-config.yml with project resource mapping to match Service blueprint properties and map project fields to corresponding GitLab project fields
- Update spec.yaml with integration details, exported resource kind, and configurations for GitLab token, app host, and GitLab host
- Implement client singleton pattern
- Add support for additional entity types
- Fix URL handling with trailing slash removal
- Add get_groups() method using GraphQL API
- Add get_issues() method using REST API
- Add get_merge_requests() method using REST API
- Replace GraphQL-based group fetching with REST API implementation
- Add min_access_level and all_available parameters to group requests
- Add new get_group_resource method to fetch resources for a specific group
- Remove unused code for issues and merge requests endpoints
- Remove 'all_available' parameter from groups request
- Add debug logging for REST API requests
- Add get_group_resource method to fetch issues and merge requests by group
- Add validation for supported resource types
- Set default state parameter for group resources to 'all'
- Improve error handling and logging for REST requests
- Update typing annotations for better code clarity
- Replace direct auth_client usage with pre-fetched headers
- Improve code formatting and reduce method complexity
- Simplify header retrieval in REST and GraphQL clients
Base automatically changed from PORT-13110-Initialise-New-Gitlab-Integration to main March 25, 2025 18:17
@github-actions github-actions bot added size/XXL and removed size/L labels Mar 25, 2025
@github-actions github-actions bot added size/L and removed size/XXL labels Mar 26, 2025
Contributor

qodo-merge-pro bot commented Mar 26, 2025

CI Feedback 🧐

(Feedback updated until commit 96c876f)

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: 🚢 gitlab-v2

Failed stage: Lint [❌]

Failure summary:

The action failed because the mypy type checking found errors in the gitlab_client.py file:

1. Line 211: Incompatible types in "yield" statement: actual type is "dict[str, Any]" but expected type is "list[dict[str, Any]]".
2. Line 229: Same error: incompatible types in "yield" statement, with actual type "dict[str, Any]" vs expected "list[dict[str, Any]]".

The make lint command failed with exit code 1 due to these type errors.

Relevant error logs:
1:  ##[group]Operating System
2:  Ubuntu
...

968:  Using cached SecretStorage-3.3.3-py3-none-any.whl (15 kB)
969:  Using cached urllib3-2.3.0-py3-none-any.whl (128 kB)
970:  Using cached jaraco.classes-3.4.0-py3-none-any.whl (6.8 kB)
971:  Using cached cryptography-44.0.2-cp39-abi3-manylinux_2_34_x86_64.whl (4.2 MB)
972:  Using cached more_itertools-10.6.0-py3-none-any.whl (63 kB)
973:  Using cached cffi-1.17.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (479 kB)
974:  Using cached pycparser-2.22-py3-none-any.whl (117 kB)
975:  Installing collected packages: trove-classifiers, ptyprocess, fastjsonschema, distlib, urllib3, tomlkit, shellingham, rapidfuzz, pyproject-hooks, pycparser, poetry-core, platformdirs, pkginfo, pexpect, packaging, msgpack, more-itertools, jeepney, installer, idna, filelock, crashtest, charset-normalizer, certifi, virtualenv, requests, jaraco.classes, dulwich, cleo, cffi, build, requests-toolbelt, cryptography, cachecontrol, SecretStorage, keyring, poetry-plugin-export, poetry
976:  Successfully installed SecretStorage-3.3.3 build-1.2.2.post1 cachecontrol-0.14.2 certifi-2025.1.31 cffi-1.17.1 charset-normalizer-3.4.1 cleo-2.1.0 crashtest-0.4.1 cryptography-44.0.2 distlib-0.3.9 dulwich-0.21.7 fastjsonschema-2.21.1 filelock-3.18.0 idna-3.10 installer-0.7.0 jaraco.classes-3.4.0 jeepney-0.9.0 keyring-24.3.1 more-itertools-10.6.0 msgpack-1.1.0 packaging-24.2 pexpect-4.9.0 pkginfo-1.12.1.2 platformdirs-4.3.7 poetry-1.8.5 poetry-core-1.9.1 poetry-plugin-export-1.8.0 ptyprocess-0.7.0 pycparser-2.22 pyproject-hooks-1.2.0 rapidfuzz-3.12.2 requests-2.32.3 requests-toolbelt-1.0.0 shellingham-1.5.4 tomlkit-0.13.2 trove-classifiers-2025.3.19.19 urllib3-2.3.0 virtualenv-20.30.0
977:  Installing dependencies from lock file
978:  No dependencies to install or update
979:  Installing the current project: gitlab-v2 (0.1.1-dev)
980:  Warning: The current project could not be installed: No file/folder found for package gitlab-v2
981:  If you do not want to install the current project use --no-root.
982:  If you want to use Poetry only for dependency management but not for packaging, you can disable package mode by setting package-mode = false in your pyproject.toml file.
983:  In a future version of Poetry this warning will become an error!
984:  ##[group]Run make lint
985:  make lint
986:  shell: /usr/bin/bash -e {0}
987:  env:
988:  pythonLocation: /opt/hostedtoolcache/Python/3.12.9/x64
989:  PKG_CONFIG_PATH: /opt/hostedtoolcache/Python/3.12.9/x64/lib/pkgconfig
990:  Python_ROOT_DIR: /opt/hostedtoolcache/Python/3.12.9/x64
991:  Python2_ROOT_DIR: /opt/hostedtoolcache/Python/3.12.9/x64
992:  Python3_ROOT_DIR: /opt/hostedtoolcache/Python/3.12.9/x64
993:  LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.12.9/x64/lib
994:  ##[endgroup]
995:  Running poetry check
996:  All set!
997:  Running mypy
998:  clients/gitlab_client.py:211: error: Incompatible types in "yield" (actual type "dict[str, Any]", expected type "list[dict[str, Any]]")  [misc]
999:  clients/gitlab_client.py:229: error: Incompatible types in "yield" (actual type "dict[str, Any]", expected type "list[dict[str, Any]]")  [misc]
1000:  Found 2 errors in 1 file (checked 20 source files)
1001:  Running ruff
1002:  warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
1003:  - 'ignore' -> 'lint.ignore'
1004:  All checks passed!
1005:  Running black
1006:  All done! ✨ 🍰 ✨
1007:  20 files would be left unchanged.
1008:  Running yamllint
1009:  One or more checks failed with exit code 1
1010:  make: *** [../_infra/Makefile:62: lint] Error 1
1011:  ##[error]Process completed with exit code 2.
1012:  Post job cleanup.
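
The two mypy errors above are the kind raised when an async generator is annotated as yielding batches (`list[dict[str, Any]]`) but a `yield` statement emits a single `dict`. The sketch below shows one way to make the annotation and the yielded value agree; `iter_batches` and `collect` are hypothetical names, and the actual code at lines 211 and 229 of gitlab_client.py is not shown in this PR page.

```python
import asyncio
from typing import Any, AsyncIterator


async def iter_batches(
    items: list[dict[str, Any]], batch_size: int
) -> AsyncIterator[list[dict[str, Any]]]:
    """Yield items in lists so each yield matches the declared type.

    Writing `yield items[0]` here would yield a dict and trigger
    mypy's 'Incompatible types in "yield"' error.
    """
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


async def collect(items: list[dict[str, Any]], batch_size: int) -> list[list[dict[str, Any]]]:
    """Gather every batch into a list, mainly for demonstration."""
    return [batch async for batch in iter_batches(items, batch_size)]
```

The alternative fix is to change the return annotation to `AsyncIterator[dict[str, Any]]` if callers really expect one item at a time.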

… helper (enrichment, group, file/search) sections.

- Organized RestClient methods by pagination, project, file, and helper categories.
- No functional changes, only reordered for better readability.
…pository

This commit refactors the GitLabClient.search_files method to use scope and query parameters instead of path patterns, aligning it with GitLab's advanced search syntax. The changes include:

- Updated search_files to take scope and query, delegating repo-specific searches to a refactored _search_in_repository.
- Modified _search_in_repository to use scope and query, removing pattern-based logic and dependencies (fnmatch, convert_glob_to_gitlab_patterns).
- Renamed FilesSelector.path to query in integration.py with a description supporting GitLab's advanced search syntax.
- Adjusted on_resync_files in main.py to hardcode scope="blobs" and use the new search_files signature.
Member

@mk-armah mk-armah left a comment

Prefer absolute imports over relative imports.

…ileProcessor` to the new `processors` module.

- Added rate limiting to `GitLabFileProcessor` with a maximum of 20 requests per second using `aiolimiter`.
- Refactored file and search handling logic for GitLab file content and existence checks.
- Added a utility function to parse search strings for GitLab queries.
- Updated dependencies in `pyproject.toml` to include `aiolimiter`.
- Adjusted imports in `integration.py` to reflect new file structure.
- Fixed type casting for project_id in _process_file method
- Removed redundant declaration of PARSEABLE_EXTENSIONS in _process_batch method
- Updated CHANGELOG with new release date (2025-04-02) and changed section from "Improvements" to "Features"
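
For readers unfamiliar with the rate-limiting pattern, the sketch below is a standard-library stand-in for what an `aiolimiter`-style limiter does: it caps how many operations may start within a time window. It is not the PR's code and does not use `aiolimiter`; the class name `SlidingWindowLimiter` and the helper `run_limited` are assumptions for illustration.

```python
import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_rate` entries per `time_period` seconds."""

    def __init__(self, max_rate: int, time_period: float = 1.0) -> None:
        self._max_rate = max_rate
        self._time_period = time_period
        self._stamps: deque[float] = deque()  # start times inside the window
        self._lock = asyncio.Lock()

    async def __aenter__(self) -> None:
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop start times that have left the window.
                while self._stamps and now - self._stamps[0] >= self._time_period:
                    self._stamps.popleft()
                if len(self._stamps) < self._max_rate:
                    self._stamps.append(now)
                    return
                # Wait until the oldest start time expires, then re-check.
                await asyncio.sleep(self._time_period - (now - self._stamps[0]))

    async def __aexit__(self, *exc_info: object) -> None:
        return None


async def run_limited(count: int, max_rate: int, time_period: float) -> float:
    """Run `count` no-op tasks through the limiter; return elapsed seconds."""
    limiter = SlidingWindowLimiter(max_rate, time_period)

    async def one_request() -> None:
        async with limiter:
            pass  # a real caller would issue the HTTP request here

    started = time.monotonic()
    await asyncio.gather(*(one_request() for _ in range(count)))
    return time.monotonic() - started
```

With a limit of 20 per second, as in the commit message, a burst of file requests would be spread out rather than sent all at once.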
… results.

- Update _search_in_repository and _search_in_group to use batches directly.
- Move GitLabFileProcessor._search setup inside if/elif for efficiency.
- Use self._rate_limiter instead of class name.
@shariff-6 shariff-6 requested a review from mk-armah April 2, 2025 18:18