-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add feature for Duplicate Key PIR #281
base: main
Are you sure you want to change the base?
Conversation
All contributors have signed the CLA ✍️ ✅ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @yunbo99, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces a new feature for Duplicate Key PIR (DkPir), which enables batch queries while preserving data privacy. The changes include adding necessary dependencies, example data, configuration files, and core logic for both sender and receiver sides. The implementation extends the existing APSI framework to support scenarios where the server-side database contains multiple rows with the same key, enabling applications like billable PIR.
Highlights
- Dependency Addition: The pull request adds
heu
as a bazel dependency inMODULE.bazel
. This library is used for homomorphic encryption, a key component of the DkPir protocol. - Example Data: New CSV files (
duplicate_key_db.csv
,duplicate_key_query.csv
,duplicate_key_target_result.csv
) are added toexamples/pir/apsi/data/
to demonstrate the duplicate key PIR functionality. These files contain sample data and queries for testing the new feature. - Configuration Files: New configuration files (
dk_pir_receiver_online.json
,dk_pir_sender_offline.json
,dk_pir_sender_online.json
) are added toexamples/pir/config/
to define the settings for the DkPir sender and receiver in both online and offline modes. These configurations specify parameters like file paths, curve types, and communication settings. - Core Logic Implementation: The pull request introduces a new directory
psi/algorithm/dkpir/
containing the core logic for the DkPir protocol. This includes implementations for the sender, receiver, OPRF receiver, and common utilities, as well as a new proto definition for secret keys. - CSV Conversion Utilities: The pull request adds a
csv_converter
library topsi/utils
, providing utilities for converting CSV files into formats suitable for APSI and DkPir. This includes functionalities for merging columns and rows, extracting queries, and extracting results.
Changelog
Click here to see the changelog
- MODULE.bazel
- Added
heu
as a bazel dependency.
- Added
- examples/pir/apsi/data/duplicate_key_db.csv
- Added a CSV file containing example database with duplicate keys.
- examples/pir/apsi/data/duplicate_key_query.csv
- Added a CSV file containing example queries for duplicate keys.
- examples/pir/apsi/data/duplicate_key_target_result.csv
- Added a CSV file containing the expected results for the example queries.
- examples/pir/config/dk_pir_receiver_online.json
- Added a configuration file for the DkPir receiver in online mode.
- examples/pir/config/dk_pir_sender_offline.json
- Added a configuration file for the DkPir sender in offline mode.
- examples/pir/config/dk_pir_sender_online.json
- Added a configuration file for the DkPir sender in online mode.
- psi/algorithm/dkpir/BUILD.bazel
- Added a BUILD file for the
dkpir
algorithm, defining the necessary libraries and dependencies.
- Added a BUILD file for the
- psi/algorithm/dkpir/README.md
- Added a README file describing the DkPir scheme, its application scenarios, and instructions for testing.
- psi/algorithm/dkpir/common.cc
- Implemented common utility functions for DkPir, including polynomial computation, key saving/loading, and result writing.
- psi/algorithm/dkpir/common.h
- Defined common data structures and function prototypes for DkPir.
- psi/algorithm/dkpir/encryptor.h
- Defined a custom Elgamal encryptor without plaintext range restrictions.
- psi/algorithm/dkpir/entry.cc
- Implemented entry points for the DkPir sender and receiver in both offline and online modes.
- psi/algorithm/dkpir/entry.h
- Defined the entry point functions for the DkPir algorithm.
- psi/algorithm/dkpir/oprf_receiver.cc
- Implemented the OPRF receiver component for DkPir.
- psi/algorithm/dkpir/oprf_receiver.h
- Defined the OPRF receiver class for DkPir.
- psi/algorithm/dkpir/query.h
- Defined the
DkPirQuery
class, extending the baseapsi::sender::Query
to include a sender count database.
- Defined the
- psi/algorithm/dkpir/receiver.cc
- Implemented the DkPir receiver logic, including OPRF request handling and query processing.
- psi/algorithm/dkpir/receiver.h
- Defined the DkPir receiver class.
- psi/algorithm/dkpir/secret_key.proto
- Defined the protobuf message for storing secret keys.
- psi/algorithm/dkpir/sender.cc
- Implemented the DkPir sender logic, including OPRF response generation and query processing.
- psi/algorithm/dkpir/sender.h
- Defined the DkPir sender class.
- psi/algorithm/dkpir/sender_cnt_db.cc
- Implemented the logic for generating the sender count database, including encryption of row counts.
- psi/algorithm/dkpir/sender_cnt_db.h
- Defined the functions for generating the sender count database.
- psi/algorithm/dkpir/sender_dispatcher.cc
- Implemented the DkPir sender dispatcher, handling OPRF requests and queries.
- psi/algorithm/dkpir/sender_dispatcher.h
- Defined the DkPir sender dispatcher class.
- psi/apps/psi_launcher/BUILD.bazel
- Added
//psi/algorithm/dkpir:entry
as a dependency topsi_launcher
.
- Added
- psi/apps/psi_launcher/launch.cc
- Added
RunDkPir
functions to handle DkPir sender and receiver configurations.
- Added
- psi/apps/psi_launcher/launch.h
- Added function declarations for
RunDkPir
.
- Added function declarations for
- psi/proto/entry.proto
- Added
DkPirSenderConfig
andDkPirReceiverConfig
to theLaunchConfig
message.
- Added
- psi/proto/pir.proto
- Added
DkPirSenderConfig
andDkPirReceiverConfig
messages, and importedpsi/proto/psi.proto
.
- Added
- psi/utils/BUILD.bazel
- Added
csv_converter
library and its test.
- Added
- psi/utils/csv_converter.cc
- Implemented the
ApsiCsvConverter
class for converting CSV files.
- Implemented the
- psi/utils/csv_converter.h
- Defined the
ApsiCsvConverter
class.
- Defined the
- psi/utils/csv_converter_test.cc
- Implemented tests for the
ApsiCsvConverter
class.
- Implemented tests for the
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Trivia time!
What does the acronym PIR stand for in the context of Private Information Retrieval?
Click here for the answer
PIR stands for Private Information Retrieval.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
The pull request introduces a new feature for Duplicate Key PIR (DkPir), including necessary code and configuration files. The changes seem well-structured, but there are a few points that need attention to ensure correctness and maintainability.
Summary of Findings
- Missing License Information: The MODULE.bazel file lacks license information for the newly added dependency 'heu'. This should be added to comply with licensing requirements.
- Potential Data Leakage: The current implementation of the DkPir scheme relies on the assumption that all client queries are present in the server's database. If this assumption is not met, the row count verification may fail, potentially leaking information about the client's query.
- Lack of Input Validation: The code does not appear to have sufficient input validation, especially for file paths and configurations. This could lead to unexpected behavior or security vulnerabilities.
Merge Readiness
The pull request introduces a significant new feature, but there are several issues that need to be addressed before merging. Specifically, the missing license information, potential data leakage, and lack of input validation should be resolved. I am unable to approve this pull request, and recommend that it not be merged until these issues are addressed.
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
按照之前的Review结果均做了修改,算法Review这边无问题,代码本身方面我这边也暂未发现问题。请工程同学再DoubleCheck一下。
LGTM |
int SenderOnline(const DkPirSenderOptions &options, | ||
const std::shared_ptr<yacl::link::Context> &lctx); | ||
|
||
int ReceiverOnline(const DkPirReceiverOptions &options, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add unit tests and benchmarks for this new algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
之前有写过单测,但会出现超时,有可能与link有关,而使用命令行运行不会出现该问题。benchmark回头我会补充一下。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可以把单测代码贴上来看看为啥会卡
@@ -244,4 +245,79 @@ PirResultReport RunPir(const ApsiSenderConfig& apsi_sender_config, | |||
return PirResultReport(); | |||
} | |||
|
|||
PirResultReport RunDkPir(const DkPirReceiverConfig& dk_pir_receiver_config, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add unit tests for these new entries and launch functions.
psi/algorithm/dkpir/sender.h
Outdated
|
||
namespace psi::dkpir { | ||
class DkPirSender : public psi::apsi_wrapper::Sender { | ||
public: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- 感觉之前的实现里Sender更像是个命名空间,而不是个类,这个地方的集成比较奇怪
- 当前的sender和老的sender有大量的重复
- 我感觉从算法上来说其实可以做的更内聚一点,可以把APSI就当做一个传输通道,真正预处理还有cnt处理的逻辑可以抽出来而不是耦合在sender和receiver的逻辑里面,看着不是很清晰,理论上你可以抛过APSI,你的算法代码应该可以做一个流程上的单测的。可以参考当前SealPIR或者SPiralPIR下面的做法,把APSI就当个通信功能,这样你的算法实现就可以比较干净。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感谢您的意见,我再尝试调整一下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我也看了一下实现,确实OPRF那个地方的耦合会比较紧,只是建议啊,看能不能把APSI抛开不涉及通信搞一个单测
No description provided.