Hi,
We encounter an issue when we use URL Toolbox with hostnames whose TLD is not in the DAT lists used by the Python script.
It seems that the script truncates and merges the end of the URL instead of keeping the last dot-separated label as the TLD.
Here are some examples:
test.containers.internal --> ut_subdomain = "test.containers" instead of "test", ut_domain = "int.host" instead of "containers.internal", ut_tld = "host" instead of "internal"
test.redhat.com.localdomain --> ut_subdomain = "test.redhat.com" instead of "test.redhat", ut_domain = "localdo.com" instead of "com.localdomain", ut_tld = "com" instead of "localdomain"
test.centos.pool.ntp.org.xxxlocal --> ut_subdomain = "test.centos.pool.ntp.org" instead of "test.centos.pool.ntp", ut_domain = "xxxl.org" instead of "org.xxxlocal", ut_tld = "org" instead of "xxxlocal"
When we add the TLD to the DAT files that the Python script uses for its lists, it works well. However, we cannot add every conceivable case. The impact is that our correlation searches do not detect the correct values.
Would it be possible to update the Python script so that, when it does not find the TLD in the DAT files, it keeps the correct values? Or is there a reason for the current behavior?
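To illustrate the behavior we expect, here is a minimal sketch of the fallback. It is not the URL Toolbox code: the function name `split_host` and the small `KNOWN_SUFFIXES` set (a stand-in for the DAT lists) are hypothetical, and only the `ut_subdomain`, `ut_domain`, and `ut_tld` field names come from the app's output.

```python
# Hypothetical sketch of the requested fallback; not the URL Toolbox implementation.
# KNOWN_SUFFIXES is an assumed stand-in for the suffix lists loaded from the DAT files.
KNOWN_SUFFIXES = {"com", "org", "co.uk"}


def split_host(host, known_suffixes=KNOWN_SUFFIXES):
    """Split a hostname into ut_subdomain, ut_domain and ut_tld.

    If no known suffix matches, fall back to the last label as the TLD
    instead of truncating or merging labels.
    """
    labels = host.lower().strip(".").split(".")

    # Try the longest candidate suffix first, then progressively shorter ones.
    tld = None
    tld_len = 0
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in known_suffixes:
            tld = candidate
            tld_len = len(labels) - i
            break

    # Fallback: unknown suffix, so keep the last label as the TLD.
    if tld is None:
        tld = labels[-1]
        tld_len = 1

    rest = labels[:-tld_len]
    # The domain is the label just before the TLD plus the TLD itself;
    # it is left empty when the host consists of the TLD only.
    domain = rest[-1] + "." + tld if rest else ""
    subdomain = ".".join(rest[:-1])

    return {"ut_subdomain": subdomain, "ut_domain": domain, "ut_tld": tld}


print(split_host("test.containers.internal"))
print(split_host("test.redhat.com.localdomain"))
```

With this fallback, the three hostnames above would yield the values we list after "instead of", while hostnames with a known suffix are still split on the longest matching suffix.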
We thank you in advance.
Best regards,
D.BRANGER
Switching to a library such as tldextract, which relies on the Public Suffix List (PSL) to separate a URL's subdomain, domain, and public suffix, doesn't solve the issue. The workaround you mention, adding the TLD to the DAT files used by the Python script for the lists, works. Attempts to change the code and tests have not yet produced a common pattern that handles these cases.
I’ll keep this open. Please let me know if you have any thoughts or suggestions. I’d greatly appreciate any help or feedback!
Thank you! I'll keep you posted if I make any changes.