Duplicate file across the tenant
Duplicate file across the tenant
Details
Summary
This report utilizes the new added Microsoft 365 scanning features of version 4.0. The 4.0 tenant scan prompts the new Agent365 to collect a hash of the file content and store the hash in the database where it can be use to compare against other files that match in content.
This query is designed to show show the different joins work together to get the desired data.
Code
WITH q AS (SELECT di.name, di.web_url, di.size, srs.byte_string(di.size) AS size_string, uc.display_name AS created_by, um.display_name AS modified_by, count(*) OVER (PARTITION BY di.file_hash) AS hash_count, srs.bytes_to_hex_string(di.file_hash) as file_hash FROM ms365.drive_items AS di LEFT OUTER JOIN ms365.users AS uc ON uc.ms365_id = di.created_by LEFT OUTER JOIN ms365.users AS um ON um.ms365_id = di.modified_by LEFT OUTER JOIN ms365.drives AS d ON d.ms365_id = di.ms365_drive_id LEFT OUTER JOIN ms365.drive_scans AS ds ON ds.id = di.scan_id WHERE (di.file_hash IS NOT NULL) AND (di.item_type = 1) AND (ds.scan_state = 1)) SELECT q.* FROM q WHERE q.hash_count >= 2
Post date
Thursday, January 7, 2021 - 15:51
Last modified
Friday, April 26, 2024 - 12:48
Downloads
Attachment | Size |
---|---|
duplicate file across the tenant.zip | 3.94 KB |
Sample Report
Attachment | Size |
---|---|
duplicate file across the tenant.pdf | 161.04 KB |