clrmame Discussion / Re: clrmame (scanner 0.08.1, rebuilder 0.15.1) released
« on: 09 December 2024, 06:55 »
I just scanned the snaps dat (the biggest single one) against a decompressed snaps set (so all hashes need to be calculated), and it took less than a minute for an uncached scan and 4 seconds for a second, cached scan...
Keep in mind that the "all project" datfile uses a totally different folder structure than the single progetto datfiles. The all datfile uses set subfolders (like artpreview\artpreview\005.png, where the first artpreview is the setname and "artpreview\005.png" is the romname), while the single ones don't use set subfolders (apart from SoftwareList dats).
So if you don't already have the files stored in the all-project way, there will be >300,000 files to reorganize. The same happens if you specify the rompath wrongly, which is, when it comes to progetto snaps, a rather common problem.
People need to remember the required storing method: rompath\setname\file1...file n for decompressed sets, rompath\setname.zip (.7z) for compressed sets. For example, the "bosses" progetto snaps datfile contains one set called "bosses" in which all pngs are stored. So if you have a rompath called "c:\test\progetto\bosses", you either have a bosses.zip file in that folder holding the pngs, or another subfolder "c:\test\progetto\bosses\bosses" which holds the single pngs decompressed.
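To make that rule concrete, here's a minimal Python sketch of where a file is expected to live under both storing methods. The helper name expected_location is made up for illustration and is not anything from clrmamepro itself:

```python
from pathlib import PureWindowsPath

def expected_location(rompath: str, setname: str, romname: str, compressed: bool) -> str:
    r"""Where a file belongs on disk under the storing rules above.

    Decompressed: rompath\setname\romname (the romname itself may carry
    a subfolder in the "all project" dat, e.g. "artpreview\005.png").
    Compressed:   rompath\setname.zip, with the file inside the archive.
    Illustrative helper only, not a clrmamepro internal.
    """
    base = PureWindowsPath(rompath)
    if compressed:
        return str(base / f"{setname}.zip")  # pngs live inside the archive
    return str(base / setname / romname)

# Single "bosses" dat, decompressed:
print(expected_location(r"c:\test\progetto\bosses", "bosses", "005.png", False))
# -> c:\test\progetto\bosses\bosses\005.png

# "all project" dat, where the romname carries its own subfolder:
print(expected_location(r"c:\test\progetto", "artpreview", r"artpreview\005.png", False))
# -> c:\test\progetto\artpreview\artpreview\005.png
```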
Progetto snaps dats usually have one set (SoftwareList ones maybe a handful more) with thousands of pngs inside it. Having them compressed can cause slowdowns when fixing (especially when using solid 7z sets).
Another point when using decompressed files is of course that each file needs to be read and at least the crc32 calculated. A full progetto snaps collection is ~300,000 files... now you can calculate yourself how long that can take for decompressed sets. That can be hours.
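For reference, the per-file work amounts to something like this minimal Python sketch (the chunk size is an arbitrary choice). Note that for zipped sets the CRC32 is already stored in the archive's central directory, so no file data has to be read at all:

```python
import zlib

def crc32_of_file(path: str, chunk_size: int = 1 << 16) -> str:
    """Read a file in chunks and return its CRC32 as 8 hex digits.

    This is the minimum work a scanner must do per decompressed file;
    a zip archive stores each member's CRC32 in its central directory,
    so compressed sets can skip this read entirely.
    """
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"
```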
Let's take for example 1/100 second for reading a file plus calculating the hash: that's 100 files per second, so 300k files take 3000 seconds, i.e. 50 minutes (0.83 hours). At 1/10 second per file you already end up with over 8 hours... and so on ;-)
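The same back-of-the-envelope calculation in Python, if you want to plug in your own per-file time (the numbers are just the examples from above):

```python
files = 300_000  # approximate size of a full progetto snaps collection
for seconds_per_file in (0.01, 0.1):
    total = files * seconds_per_file  # total scan time in seconds
    print(f"{seconds_per_file} s/file -> {total:.0f} s = {total / 3600:.2f} h")
# 0.01 s/file -> 3000 s = 0.83 h
# 0.1 s/file -> 30000 s = 8.33 h
```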
And another point: the fewer sets a datfile specifies (and in the progetto case you usually only have one), the less speedup you will get from threading, since threading checks sets in parallel, not files.
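A minimal sketch of why that caps the speedup, assuming (as described above) that the unit of parallel work is the set. ThreadPoolExecutor just stands in for a scanner's worker pool and check_set is a placeholder:

```python
from concurrent.futures import ThreadPoolExecutor

def check_set(setname: str) -> str:
    """Placeholder for scanning one set: read and hash every file in it."""
    return setname

sets = ["bosses"]  # a typical progetto dat: one huge set

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(check_set, sets))
# With len(sets) == 1 there is only one unit of work to hand out, so
# seven of the eight workers sit idle no matter how many threads exist.
```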
But to give this thread an end: I will look at what can be optimized when working with huge one-set dats where the files are decompressed... actually I already found something which can be improved, so we'll see what I can do.