Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: refactor requests from freetexts views to other similar views #660

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

sugat009
Copy link
Member

@sugat009 sugat009 commented Feb 14, 2025

Description

Refactored calls to freetexts views in cht-core to use other views instead because the freetext views will soon be removed. Check ticket for more info.

Partially solves: medic/cht-core#9692

Code review items

  • Readable: Concise, well named, follows the style guide, documented if necessary.
  • Documented: Configuration and user documentation on cht-docs
  • Tested: Unit and/or integration tests where appropriate
  • Backwards compatible: Works with existing data and configuration. Any breaking changes documented in the release notes.

License

The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.

@sugat009
Copy link
Member Author

@jkuester Looking at the implementation of getReportsForContacts, I think we need another REST endpoint and a function in cht-datasource for retrieving reports as a whole and not just the UUIDs. Also, we only support the searching of reports(and also contacts) by a single freetext, here it seems the the reports are being searched by a list of freetexts. We should probably support this functionality as well.

@jkuester
Copy link
Contributor

😅 Wow, it did not take us long to find new workflows to support here!

I think we need another REST endpoint and a function in cht-datasource for retrieving reports as a whole and not just the UUIDs

Eventually we may decide we want to do this (and the Qualifier structure in cht-datasource should hopefully make this easy to support). But for now, I think it is okay that getting the docs for a freetext search requires two calls. One to search for the UUIDs and a second to go get the contacts for those UUIDs.

The main problem with that is we do not actually support the second call yet... 🤦 😞 There is no way to get multiple contact docs by UUID at the same time. It should be trivial to add support for this (but we should not do it in the current cht-datasource PR).

here it seems the the reports are being searched by a list of freetexts

This is interesting functionality. I guess you basically end up with the union of all the key matches. This is another workflow that we could eventually support, but it feels like a bit of an edge case in the context of freetext searching. (An intersection would seem to be more useful to me.) In general, I think it is acceptable to say folks need to just make multiple search requests if they want to search with different keys.

However, the cht-conf functionality linked here does complicate this since we could end up with a large number of requests....


The more I look at this cht-conf code, though, the more I am wondering if we need the freetext search at all?!? 🤔 Basically this code is abusing the key:value searching to track down reports for a particular contact. Unless I am missing something, this searching could be more efficiently done by using the purpose-built views to get this data.

  const createdAtKeys = createdAtId ? [
    [`patient_id:${createdAtId}`],
    [`patient_uuid:${createdAtId}`],
    [`place_id:${createdAtId}`],
    [`place_uuid:${createdAtId}`]
  ] : [];

To get all the reports associated with the patient_id/patient_uuid/place_id/place_uuid we should just be making a medic-client/reports_by_subject query. Right?

  const createdByKeys = createdByIds.map(id => [`contact:${id}`]);

Oh woah! This is it! 😆 Mokhtar and I spent a bunch of time trying to figure out why we had this weird custom emission from reports_by_freetext. I am now guessing it was to support this code here. 🤦

Regardless, this does not need to be a freetext query. I think we should be able to use medic-scripts/data_records_by_ancestor or even just medic-scripts/data_records to query for the same results.

@dianabarsan is there any concern with calling medic-scripts views from cht-conf? We could keep doing the freetext query for contact:${id} if that would be better....

@dianabarsan
Copy link
Member

dianabarsan commented Feb 17, 2025

I think medic-scripts/data_records_by_ancestor needs to be modified to support this use case, as it will emit for the contact and every saved ancestor, and the same applies for medic-scripts/data_records. this means that if you search for a contact, you will get results for that contact even if it was not the direct submitter. Which in the case here might actually be more correct??? because those reports would end up with incorrect submitter lineages if they weren't changed.
Of course, we should only search for person submitters which should never have any descendants. But who knows what the future holds!

I think indexing reports by submitter is a pretty important functionality in the app, that we might not want to lose. Also medic-scripts feels like a third grade citizen in terms of ddoc importance.

I firmly believe we should revise our ddocs and make sure we don't have too much duplicated emissions. Old Couch would never index views that are not used, but new couch will index everything in the background unless expressly specified not skip.

To conclude:
I think we should have a distinctive way of linking reports to their submitters! Which ddoc the info lives in is less important than having the functionality. But the functionality needs to exist and be straight forward, instead of buried in some ddoc nobody reads or that is considered 3rd class citizen.
Up until now, freetext search was a jack-of-all-trades to indexing and was just used broadly for everything.

@jkuester
Copy link
Contributor

Thanks @dianabarsan!

I think medic-scripts/data_records_by_ancestor needs to be modified to support this use case, as it will emit for the contact and every saved ancestor, and the same applies for medic-scripts/data_records. this means that if you search for a contact, you will get results for that contact even if it was not the direct submitter.

Ah, yes, this makes sense. I had missed the fact that the descendant's reports would also come back for an ancestor contact.

Which in the case here might actually be more correct??? because those reports would end up with incorrect submitter lineages if they weren't changed.

In this case, I don't think we want to worry about descendant contact's reports. Those are handled elsewhere in the code flow (and then the lineage hierarchy of all the affected contacts/reports gets totally re-written to be correct. For the purposes of this PR, I am fine assuming the current code flow is "correct" and for this query we only care about getting the reports submitted by the actual contact.

I firmly believe we should revise our ddocs and make sure we don't have too much duplicated emissions
...

think we should have a distinctive way of linking reports to their submitters!

10000% agree. This needs to happen, but just not as part of this initial TCO/freetext work! I really want to avoid any other major uplifts creeping in here... 😬


@sugat009, going through this again, I think we are still good using the medic-client/reports_by_subject index to do the subjectId lookup. (This is what we are already using in shared-libs/search anyway to look up the contact's reports by subjectId.) This should work with basically all CHT versions without a need to have any version flagging.

Then, we can continue doing the contact:${id} lookup with the freetext views. We can use a version flag to determine which API to hit. If we can still do the multi-key search with the Nouveau endpoint, then I think we should just directly query that. Otherwise, if we are limited to 1 key per query request, it seems better to just use our REST api.

@sugat009 sugat009 force-pushed the 9692-refactor-medic-code branch from 1eae023 to 05f2355 Compare February 24, 2025 11:53
@sugat009 sugat009 changed the title feat: refactor requests to freetexts views to REST API instead feat: refactor requests to freetexts views to other views Feb 24, 2025
@sugat009 sugat009 changed the title feat: refactor requests to freetexts views to other views feat: refactor requests to freetexts views to similar views Feb 24, 2025
@sugat009 sugat009 requested a review from jkuester February 24, 2025 14:38
@sugat009 sugat009 self-assigned this Feb 24, 2025
@sugat009 sugat009 added the Type: Feature Add something new label Feb 24, 2025
@sugat009 sugat009 marked this pull request as ready for review February 24, 2025 14:44
@sugat009 sugat009 changed the title feat: refactor requests to freetexts views to similar views feat: refactor requests from freetexts views to similar views Feb 24, 2025
@sugat009 sugat009 changed the title feat: refactor requests from freetexts views to similar views feat: refactor requests from freetexts views to other similar views Feb 24, 2025
@jkuester
Copy link
Contributor

Okay, so silly question here, but I think maybe you forgot about (or misunderstood) the last half of my comment... 😅

Then, we can continue doing the contact:${id} lookup with the freetext views. We can use a version flag to determine which API to hit. If we can still do the multi-key search with the Nouveau endpoint, then I think we should just directly query that. Otherwise, if we are limited to 1 key per query request, it seems better to just use our REST api.

If we have a createdAtId we want to do the medic-client/reports_by_subject query with the createdAtId as the key.

Additionally, regardless of the createdAtId value, we want to do the freetext query with const createdByKeys = createdByIds.map(id => [contact:${id}]); as the keys. Depending on the current version of the CHT, we either need to hit medic-client/reports_by_freetext or medic-nouveau/_nouveau/reports_by_freetext. I did play around with the Nouveau endpoint a bit and confirmed that it does support a multi-key UNION search basically by just using the OR keyword. However, the contact:id query will not work on the current version of the index we have in the couchdb-nouveau branch since it looks like Mokhtar did not include the special handling for the contact field. Once we get that updated, we should be able to search against the contact field with each of the createdByIds similar to what we can do with the existing medic-client/reports_by_freetext query.

@sugat009
Copy link
Member Author

@jkuester I've left out the checking for specific version of CHT and querying against nouveau for now as the release version is not fixed yet. I've left a TODO comment which is why sonarlint is failing. Probably can be ignored.

@sugat009
Copy link
Member Author

UPDATE: on today's Hosting TCO squad it was decided that the query to the Nouveau views will be done without the version check at the moment to have the code ready to be shipped. When the release number is decided then the if check will be added and cht-conf will be released. Ref.

@jkuester
Copy link
Contributor

Might be convenient to just include an if-check against a placeholder version so you can fully implement both sides of the if-else along with unit tests. Then, once we have a real CHT version, it will be a simple copy-paste to update the version we are using.

@jkuester
Copy link
Contributor

FYI @sugat009 I just pushed a change to the cht-core couchdb_nouveau branch to add back the special contact keys for reports. Just testing locally, I was able to do a multi-key search against medic/_design/medic-nouveau/_nouveau/reports_by_freetext with the following body:

{
    "q":"contact:0790b03c-3a75-415f-8cf3-9d6dae8f16d0 OR contact:8e9f059d-9cd0-4da3-98ac-ef6aaf0b9c1f"
}

I think this should be enough to unblock you here, but LMK if you have issues.

});
});

it('move health_center_1 to district_2 in CHT version 4.99.0', async () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've put in the version 4.99.0 for now because we are not sure which version the nouveau stuff will be shipped in, and it provides us with a good buffer. We may need to change the version here once Nouveau is released.

@sugat009
Copy link
Member Author

sugat009 commented Mar 6, 2025

@jkuester This is ready for review now.

@sugat009 sugat009 marked this pull request as draft March 11, 2025 06:02
@sugat009 sugat009 marked this pull request as ready for review March 11, 2025 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Feature Add something new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants