Duplicate file in different volumes

Forums

I need to find duplicate files in different volumes.
Top 100 biggest file on different volumes
Top 50 biggest folders on different volumes
Files older than 1 December 2007.
Can somebody help me with the query statement

The top 100 biggest files on different volumes. Different volumes, specified list or against all volumes that you have scan data for?

Top-5-largest-files-volume shows the top 5 files by size, but change the last line from 5 to 100. This will query the top 100 biggest files across all fs_scan_policy targets. If you need to only report on some of your policy targets, than the query will have to be changed some.

Question: Files older than 1 December 2007, (SQL Server or PostgreSQL)? Second question: Older by which time: modify_time, create_time, or accessed_time?

Here is a rough draft for PostgreSQL: (you will need to edit the path in line 9)

WITH
rp(ns_left, ns_right, root_path, scan_id) AS (SELECT sd2.ns_left,
sd2.ns_right,
sd2.fullpath AS root_path,
s.id AS scan_id
FROM srs.scan_data AS sd
INNER JOIN srs.scan_data AS sd2 ON sd2.path_depth = sd.path_depth + 1
INNER JOIN srs.scans AS s ON (s.id = sd.scan_id) AND (s.id = sd2.scan_id)
WHERE (sd.idx = sd2.parent_idx) AND
(sd.fullpath_hash IN (srs.path_hash('\\CCC4\SHARE'))) AND
(s.progress_status = 4) AND
(s.scan_type = 1) AND
(sd.path_type = 2) AND
(sd2.path_type = 2))
SELECT
sd.fullpath,
sd.size,
srs.byte_string(sd.size) AS size_string,
sd.modify_time
FROM
srs.scan_data AS sd
INNER JOIN rp
ON rp.scan_id = sd.scan_id
WHERE
(sd.ns_left >= rp.ns_left) AND
(sd.ns_right <= rp.ns_right) AND
(sd.path_type = 1) AND
(sd.modify_time < date '2007-12-01')
GROUP BY
sd.fullpath, sd.size, sd.modify_time

Top 50 Folders by size per scan_target: (Let me know if this is what you were hoping for?)

WITH
x(identity_system, scan_target, fullpath, size, size_string, path_type, rownum, owner) AS (SELECT sd.identity_system,
sd.scan_target,
sd.fullpath,
sd.size,
srs.byte_string(sd.size) AS size_string,
sd.path_type,
Row_Number() OVER (PARTITION BY sd.scan_target ORDER BY sd.size DESC, sd.fullpath_hash) AS rownum,
sd.owner_display_name AS owner
FROM srs.current_fs_scandata AS sd
WHERE (sd.path_type = 2) AND
(sd.path_depth > 0))
SELECT
*,
srs.byte_string(CAST(Sum(x.size) OVER (PARTITION BY x.scan_target) AS BIGINT)) AS target_total_string,
x.size AS size1
FROM
x
WHERE
(x.rownum <= 50) AND
(x.size > 0)