1.11.0 #46

Merged
merged 15 commits on Jan 14, 2025
2 changes: 1 addition & 1 deletion .run/dqo run.run.xml
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
<option name="region" />
<option name="useCurrentConnection" value="false" />
</extension>
<option name="JAR_PATH" value="$PROJECT_DIR$/dqops/target/dqo-dqops-1.10.1.jar" />
<option name="JAR_PATH" value="$PROJECT_DIR$/dqops/target/dqo-dqops-1.11.0.jar" />
<option name="VM_PARAMETERS" value="-XX:MaxRAMPercentage=60.0 --add-opens java.base/java.nio=ALL-UNNAMED --add-opens java.base/java.util.concurrent=ALL-UNNAMED" />
<option name="PROGRAM_PARAMETERS" value="--server.port=8888 --dqo.python.debug-mode=silent" />
<option name="WORKING_DIRECTORY" value="$PROJECT_DIR$" />
8 changes: 4 additions & 4 deletions CHANGELOG.md
@@ -1,5 +1,5 @@
# 1.10.1

* Small UI fixes to open pages directly from an URL.
* Fix problems when installing on Windows using pip, when Python was installed from Windows Store and uses a deeply nested folder structure
# 1.11.0

* Fixing problems when importing files from a non-existing folder
* Upgrade DuckDB to 1.1.3
* Support Avro files
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.10.1
1.11.0
2 changes: 1 addition & 1 deletion distribution/pom.xml
@@ -11,7 +11,7 @@

<groupId>com.dqops</groupId>
<artifactId>dqo-distribution</artifactId>
<version>1.10.1</version> <!-- DQOps Version, do not touch (changed automatically) -->
<version>1.11.0</version> <!-- DQOps Version, do not touch (changed automatically) -->
<name>dqo-distribution</name>
<description>DQOps Data Quality Operations Center final assembly</description>
<packaging>pom</packaging>
2 changes: 2 additions & 0 deletions distribution/python/dqops/client/models/__init__.py
@@ -60,6 +60,7 @@
)
from .authenticated_dashboard_model import AuthenticatedDashboardModel
from .auto_import_tables_spec import AutoImportTablesSpec
from .avro_file_format_spec import AvroFileFormatSpec
from .aws_authentication_mode import AwsAuthenticationMode
from .azure_authentication_mode import AzureAuthenticationMode
from .between_floats_rule_parameters_spec import BetweenFloatsRuleParametersSpec
@@ -2154,6 +2155,7 @@
"AnomalyTimelinessDelayRuleWarning1PctParametersSpec",
"AuthenticatedDashboardModel",
"AutoImportTablesSpec",
"AvroFileFormatSpec",
"AwsAuthenticationMode",
"AzureAuthenticationMode",
"BetweenFloatsRuleParametersSpec",
58 changes: 58 additions & 0 deletions distribution/python/dqops/client/models/avro_file_format_spec.py
@@ -0,0 +1,58 @@
from typing import Any, Dict, List, Type, TypeVar, Union

from attrs import define as _attrs_define
from attrs import field as _attrs_field

from ..types import UNSET, Unset

T = TypeVar("T", bound="AvroFileFormatSpec")


@_attrs_define
class AvroFileFormatSpec:
"""
Attributes:
filename (Union[Unset, bool]): Whether or not an extra filename column should be included in the result.
"""

filename: Union[Unset, bool] = UNSET
additional_properties: Dict[str, Any] = _attrs_field(init=False, factory=dict)

def to_dict(self) -> Dict[str, Any]:
filename = self.filename

field_dict: Dict[str, Any] = {}
field_dict.update(self.additional_properties)
field_dict.update({})
if filename is not UNSET:
field_dict["filename"] = filename

return field_dict

@classmethod
def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
d = src_dict.copy()
filename = d.pop("filename", UNSET)

avro_file_format_spec = cls(
filename=filename,
)

avro_file_format_spec.additional_properties = d
return avro_file_format_spec

@property
def additional_keys(self) -> List[str]:
return list(self.additional_properties.keys())

def __getitem__(self, key: str) -> Any:
return self.additional_properties[key]

def __setitem__(self, key: str, value: Any) -> None:
self.additional_properties[key] = value

def __delitem__(self, key: str) -> None:
del self.additional_properties[key]

def __contains__(self, key: str) -> bool:
return key in self.additional_properties
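The new `AvroFileFormatSpec` follows the serialization convention used across this generated client: an `UNSET` sentinel distinguishes "field not provided" from an explicit value, and unrecognized keys survive a round trip through `additional_properties`. The sketch below reproduces that pattern with the standard library only (the real class is built with the `attrs` library); the class and field names other than `filename` are illustrative.

```python
# Stdlib-only sketch of the UNSET/additional_properties pattern used by the
# generated AvroFileFormatSpec. The real client uses attrs; names here other
# than "filename" are hypothetical.
from dataclasses import dataclass, field
from typing import Any, Dict


class _Unset:
    def __repr__(self) -> str:
        return "UNSET"


UNSET = _Unset()


@dataclass
class AvroSpecSketch:
    filename: Any = UNSET  # Union[Unset, bool] in the generated model
    additional_properties: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Unknown keys first, then known fields that were actually set.
        out: Dict[str, Any] = dict(self.additional_properties)
        if self.filename is not UNSET:
            out["filename"] = self.filename
        return out

    @classmethod
    def from_dict(cls, src: Dict[str, Any]) -> "AvroSpecSketch":
        d = dict(src)
        spec = cls(filename=d.pop("filename", UNSET))
        spec.additional_properties = d  # whatever remains is preserved
        return spec


spec = AvroSpecSketch.from_dict({"filename": True, "future_option": 1})
print(spec.to_dict())  # {'future_option': 1, 'filename': True}
```

The sentinel matters because `filename: false` and "no `filename` key" serialize differently: only the former is emitted back to the server.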
Original file line number Diff line number Diff line change
@@ -64,6 +64,7 @@ class DqoUserProfileModel:
must be configured correctly and the user must have at least an EDITOR role.
can_use_ai_anomaly_detection (Union[Unset, bool]): The DQOps instance is a paid version with advanced AI anomaly
prediction.
can_logout (Union[Unset, bool]): This instance uses federated authentication and the user can log out.
"""

user: Union[Unset, str] = UNSET
@@ -98,6 +99,7 @@ class DqoUserProfileModel:
can_use_data_domains: Union[Unset, bool] = UNSET
can_synchronize_to_data_catalog: Union[Unset, bool] = UNSET
can_use_ai_anomaly_detection: Union[Unset, bool] = UNSET
can_logout: Union[Unset, bool] = UNSET
additional_properties: Dict[str, Any] = _attrs_field(init=False, factory=dict)

def to_dict(self) -> Dict[str, Any]:
@@ -138,6 +140,7 @@ def to_dict(self) -> Dict[str, Any]:
can_use_data_domains = self.can_use_data_domains
can_synchronize_to_data_catalog = self.can_synchronize_to_data_catalog
can_use_ai_anomaly_detection = self.can_use_ai_anomaly_detection
can_logout = self.can_logout

field_dict: Dict[str, Any] = {}
field_dict.update(self.additional_properties)
@@ -212,6 +215,8 @@ def to_dict(self) -> Dict[str, Any]:
)
if can_use_ai_anomaly_detection is not UNSET:
field_dict["can_use_ai_anomaly_detection"] = can_use_ai_anomaly_detection
if can_logout is not UNSET:
field_dict["can_logout"] = can_logout

return field_dict

@@ -293,6 +298,8 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:

can_use_ai_anomaly_detection = d.pop("can_use_ai_anomaly_detection", UNSET)

can_logout = d.pop("can_logout", UNSET)

dqo_user_profile_model = cls(
user=user,
tenant=tenant,
@@ -326,6 +333,7 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
can_use_data_domains=can_use_data_domains,
can_synchronize_to_data_catalog=can_synchronize_to_data_catalog,
can_use_ai_anomaly_detection=can_use_ai_anomaly_detection,
can_logout=can_logout,
)

dqo_user_profile_model.additional_properties = d
Original file line number Diff line number Diff line change
@@ -2,6 +2,7 @@


class DuckdbFilesFormatType(str, Enum):
AVRO = "avro"
CSV = "csv"
DELTA_LAKE = "delta_lake"
ICEBERG = "iceberg"
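The `DuckdbFilesFormatType` enum above subclasses `str` as well as `Enum`, so members compare equal to their raw string values; that is what lets the literal `"avro"` read from a parsed config map directly onto the new member. A minimal sketch of the pattern (member list abbreviated for illustration):

```python
# Sketch of the str-based Enum pattern behind DuckdbFilesFormatType.
# Only a few members are shown; the class name here is illustrative.
from enum import Enum


class FilesFormatType(str, Enum):
    AVRO = "avro"
    CSV = "csv"
    PARQUET = "parquet"


fmt = FilesFormatType("avro")       # lookup by value, e.g. from parsed YAML/JSON
print(fmt is FilesFormatType.AVRO)  # True
print(fmt == "avro")                # True: str subclass compares to plain strings
```

Because of the `str` mixin, the member can be passed anywhere a plain string is expected (string formatting, JSON serialization) without an explicit `.value` access.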
18 changes: 18 additions & 0 deletions distribution/python/dqops/client/models/duckdb_parameters_spec.py
@@ -11,6 +11,7 @@
from ..types import UNSET, Unset

if TYPE_CHECKING:
from ..models.avro_file_format_spec import AvroFileFormatSpec
from ..models.csv_file_format_spec import CsvFileFormatSpec
from ..models.delta_lake_file_format_spec import DeltaLakeFileFormatSpec
from ..models.duckdb_parameters_spec_directories import (
@@ -40,6 +41,7 @@ class DuckdbParametersSpec:
csv (Union[Unset, CsvFileFormatSpec]):
json (Union[Unset, JsonFileFormatSpec]):
parquet (Union[Unset, ParquetFileFormatSpec]):
avro (Union[Unset, AvroFileFormatSpec]):
iceberg (Union[Unset, IcebergFileFormatSpec]):
delta_lake (Union[Unset, DeltaLakeFileFormatSpec]):
directories (Union[Unset, DuckdbParametersSpecDirectories]): Virtual schema name to directory mappings. The path
@@ -70,6 +72,7 @@ class DuckdbParametersSpec:
csv: Union[Unset, "CsvFileFormatSpec"] = UNSET
json: Union[Unset, "JsonFileFormatSpec"] = UNSET
parquet: Union[Unset, "ParquetFileFormatSpec"] = UNSET
avro: Union[Unset, "AvroFileFormatSpec"] = UNSET
iceberg: Union[Unset, "IcebergFileFormatSpec"] = UNSET
delta_lake: Union[Unset, "DeltaLakeFileFormatSpec"] = UNSET
directories: Union[Unset, "DuckdbParametersSpecDirectories"] = UNSET
@@ -111,6 +114,10 @@ def to_dict(self) -> Dict[str, Any]:
if not isinstance(self.parquet, Unset):
parquet = self.parquet.to_dict()

avro: Union[Unset, Dict[str, Any]] = UNSET
if not isinstance(self.avro, Unset):
avro = self.avro.to_dict()

iceberg: Union[Unset, Dict[str, Any]] = UNSET
if not isinstance(self.iceberg, Unset):
iceberg = self.iceberg.to_dict()
@@ -160,6 +167,8 @@ def to_dict(self) -> Dict[str, Any]:
field_dict["json"] = json
if parquet is not UNSET:
field_dict["parquet"] = parquet
if avro is not UNSET:
field_dict["avro"] = avro
if iceberg is not UNSET:
field_dict["iceberg"] = iceberg
if delta_lake is not UNSET:
@@ -191,6 +200,7 @@ def to_dict(self) -> Dict[str, Any]:

@classmethod
def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
from ..models.avro_file_format_spec import AvroFileFormatSpec
from ..models.csv_file_format_spec import CsvFileFormatSpec
from ..models.delta_lake_file_format_spec import DeltaLakeFileFormatSpec
from ..models.duckdb_parameters_spec_directories import (
@@ -248,6 +258,13 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
else:
parquet = ParquetFileFormatSpec.from_dict(_parquet)

_avro = d.pop("avro", UNSET)
avro: Union[Unset, AvroFileFormatSpec]
if isinstance(_avro, Unset):
avro = UNSET
else:
avro = AvroFileFormatSpec.from_dict(_avro)

_iceberg = d.pop("iceberg", UNSET)
iceberg: Union[Unset, IcebergFileFormatSpec]
if isinstance(_iceberg, Unset):
@@ -314,6 +331,7 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
csv=csv,
json=json,
parquet=parquet,
avro=avro,
iceberg=iceberg,
delta_lake=delta_lake,
directories=directories,
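In `DuckdbParametersSpec`, the new `avro` field is a nested spec, and `to_dict` delegates to the child's own `to_dict` only when the field was actually set, so the output grows an `"avro"` node exactly the way `csv`/`json`/`parquet` already do. A trimmed, stdlib-only sketch of that delegation (class names and the `None` check standing in for the client's `Unset` check are illustrative):

```python
# Sketch of nested-spec delegation in DuckdbParametersSpec.to_dict.
# None stands in for the generated client's UNSET sentinel; names are trimmed.
from typing import Any, Dict, Optional


class AvroFormat:
    def __init__(self, filename: Optional[bool] = None) -> None:
        self.filename = filename

    def to_dict(self) -> Dict[str, Any]:
        return {} if self.filename is None else {"filename": self.filename}


class DuckdbParamsSketch:
    def __init__(self, avro: Optional[AvroFormat] = None) -> None:
        self.avro = avro

    def to_dict(self) -> Dict[str, Any]:
        out: Dict[str, Any] = {}
        if self.avro is not None:        # isinstance(self.avro, Unset) check in the real code
            out["avro"] = self.avro.to_dict()
        return out


params = DuckdbParamsSketch(avro=AvroFormat(filename=True))
print(params.to_dict())  # {'avro': {'filename': True}}
```

`from_dict` mirrors this: it pops the `"avro"` key and, when present, rebuilds the child via `AvroFileFormatSpec.from_dict`, as shown in the hunk above.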
18 changes: 18 additions & 0 deletions distribution/python/dqops/client/models/file_format_spec.py
@@ -6,6 +6,7 @@
from ..types import UNSET, Unset

if TYPE_CHECKING:
from ..models.avro_file_format_spec import AvroFileFormatSpec
from ..models.csv_file_format_spec import CsvFileFormatSpec
from ..models.delta_lake_file_format_spec import DeltaLakeFileFormatSpec
from ..models.iceberg_file_format_spec import IcebergFileFormatSpec
@@ -23,6 +24,7 @@ class FileFormatSpec:
csv (Union[Unset, CsvFileFormatSpec]):
json (Union[Unset, JsonFileFormatSpec]):
parquet (Union[Unset, ParquetFileFormatSpec]):
avro (Union[Unset, AvroFileFormatSpec]):
iceberg (Union[Unset, IcebergFileFormatSpec]):
delta_lake (Union[Unset, DeltaLakeFileFormatSpec]):
file_paths (Union[Unset, List[str]]): The list of paths to files with data that are used as a source.
@@ -31,6 +33,7 @@ class FileFormatSpec:
csv: Union[Unset, "CsvFileFormatSpec"] = UNSET
json: Union[Unset, "JsonFileFormatSpec"] = UNSET
parquet: Union[Unset, "ParquetFileFormatSpec"] = UNSET
avro: Union[Unset, "AvroFileFormatSpec"] = UNSET
iceberg: Union[Unset, "IcebergFileFormatSpec"] = UNSET
delta_lake: Union[Unset, "DeltaLakeFileFormatSpec"] = UNSET
file_paths: Union[Unset, List[str]] = UNSET
@@ -49,6 +52,10 @@ def to_dict(self) -> Dict[str, Any]:
if not isinstance(self.parquet, Unset):
parquet = self.parquet.to_dict()

avro: Union[Unset, Dict[str, Any]] = UNSET
if not isinstance(self.avro, Unset):
avro = self.avro.to_dict()

iceberg: Union[Unset, Dict[str, Any]] = UNSET
if not isinstance(self.iceberg, Unset):
iceberg = self.iceberg.to_dict()
@@ -70,6 +77,8 @@ def to_dict(self) -> Dict[str, Any]:
field_dict["json"] = json
if parquet is not UNSET:
field_dict["parquet"] = parquet
if avro is not UNSET:
field_dict["avro"] = avro
if iceberg is not UNSET:
field_dict["iceberg"] = iceberg
if delta_lake is not UNSET:
@@ -81,6 +90,7 @@ def to_dict(self) -> Dict[str, Any]:

@classmethod
def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
from ..models.avro_file_format_spec import AvroFileFormatSpec
from ..models.csv_file_format_spec import CsvFileFormatSpec
from ..models.delta_lake_file_format_spec import DeltaLakeFileFormatSpec
from ..models.iceberg_file_format_spec import IcebergFileFormatSpec
@@ -109,6 +119,13 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
else:
parquet = ParquetFileFormatSpec.from_dict(_parquet)

_avro = d.pop("avro", UNSET)
avro: Union[Unset, AvroFileFormatSpec]
if isinstance(_avro, Unset):
avro = UNSET
else:
avro = AvroFileFormatSpec.from_dict(_avro)

_iceberg = d.pop("iceberg", UNSET)
iceberg: Union[Unset, IcebergFileFormatSpec]
if isinstance(_iceberg, Unset):
@@ -129,6 +146,7 @@ def from_dict(cls: Type[T], src_dict: Dict[str, Any]) -> T:
csv=csv,
json=json,
parquet=parquet,
avro=avro,
iceberg=iceberg,
delta_lake=delta_lake,
file_paths=file_paths,
4 changes: 2 additions & 2 deletions distribution/python/dqops/version.py
@@ -15,8 +15,8 @@
# limit

# WARNING: the next two lines with the version numbers (VERSION =, PIP_VERSION =) should not be modified manually. They are changed by a maven profile at compile time.
VERSION = "1.10.1"
PIP_VERSION = "1.10.1"
VERSION = "1.11.0"
PIP_VERSION = "1.11.0"
GITHUB_RELEASE = "v" + VERSION + ""
JAVA_VERSION = "17"

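As the warning comment in `version.py` notes, the `VERSION` and `PIP_VERSION` constants are rewritten by a Maven profile at compile time; the derived constants are then computed from them at import time. The fragment below reproduces that derivation with the bumped values:

```python
# Derivation of GITHUB_RELEASE from VERSION as done in dqops/version.py,
# shown with the values after this release bump.
VERSION = "1.11.0"
PIP_VERSION = "1.11.0"
GITHUB_RELEASE = "v" + VERSION + ""
print(GITHUB_RELEASE)  # v1.11.0
```

Keeping the two version constants on adjacent, fixed lines is what lets the build tooling substitute them mechanically without parsing the module.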
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
---
title: How to detect data accuracy issues
title: How to Detect Data Accuracy Issues? Examples and Best Practices
---
# How to detect data accuracy issues
Data accuracy checks in DQOps compare an aggregated value in a tested table to the same aggregated value in a reference table.
# How to Detect Data Accuracy Issues? Examples and Best Practices
Data accuracy checks compare an aggregated value in a tested table to the same aggregated value in a reference table to detect differences.

The accuracy checks in DQOps are configured in the `accuracy` category of data quality checks.

Original file line number Diff line number Diff line change
@@ -1,8 +1,9 @@
---
title: How to detect anomalies in numeric data
title: How to Detect Anomalies in Numeric Data? Examples and Best Practices
---
# How to detect anomalies in numeric data
Read this guide to learn how to detect anomaly data quality issues in numeric data using DQOps.
# How to Detect Anomalies in Numeric Data? Examples and Best Practices
Read this guide to learn how to detect data anomalies (outliers) in numeric data that has timestamp columns to identify time-series with historic values.

The data quality checks are configured in the `anomaly` category in DQOps.

## What is an anomaly in data
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: How to detect values not matching patterns
title: How to Detect Values not Matching Patterns? Examples
---
# How to detect values not matching patterns
# How to Detect Values not Matching Patterns? Examples
Read this guide to learn how to validate column values if they match patterns, such as phone numbers, emails, or any regular expression.

The pattern match checks are configured in the `patterns` category in DQOps.