EMULAB Forum

Author Topic: 7z blockwise memory decompress  (Read 16474 times)

Roman

7z blockwise memory decompress
« on: 28 February 2014, 18:42 »

As you might have noticed from the number of releases, I'm pretty busy with real life... so looking at 7z is nearly impossible at the moment...

Looking at 7z currently means: trying to get the C++ part of the 7z SDK to work. The old routines use the C core of the SDK, which is limited when it comes to blockwise operations (everything is decompressed to memory/file in one go) and > 4GB support.

While I had no problem using the general reading/extracting mechanisms of the C++ COM part, I'm still looking for a way to hook in a callback which can update calculated hash values or (most likely the same callback) decompress data blockwise to memory (for testing purposes)...

I might have overlooked such a callback due to my lack of time, but I only found an update callback which can e.g. be used to print out a progress percentage. It doesn't give me access to a memory block/size (to update my hash calcs etc.)...

So....if anyone knows where to find this...you're more than welcome...

Roman

Re: 7z blockwise memory decompress
« Reply #1 on: 03 March 2014, 13:03 »

ok...got the answer from the 7z author himself...


"
Client7z.cpp:
CArchiveExtractCallback::GetStream
_outFileStreamSpec = new COutFileStream;
CMyComPtr<ISequentialOutStream> outStreamLoc(_outFileStreamSpec);
if (!_outFileStreamSpec->Open(fullProcessedPath, CREATE_ALWAYS))
{
  PrintError("Can not open output file", fullProcessedPath);
  return E_ABORT;
}
You must create object of your class, that implements ISequentialOutStream and calculates Hash
"



now I only need the time to do so :-)
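
For anyone following along, a minimal sketch of what such a class could look like. This is not the code that ended up in the tool: `SequentialOutStream` is only a stand-in for the SDK's ISequentialOutStream so the example builds without the 7-Zip headers, and `HashingOutStream` and `HashInBlocks` are made-up names. With the real SDK, Write returns an HRESULT, and an object like this would presumably be handed out from CArchiveExtractCallback::GetStream in place of the COutFileStream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in (assumption) for the SDK's ISequentialOutStream. The real one is a
// COM interface; only the shape of Write() is mirrored here.
struct SequentialOutStream {
    virtual ~SequentialOutStream() = default;
    virtual long Write(const void* data, std::uint32_t size,
                       std::uint32_t* processedSize) = 0;
};

// Hypothetical sink: stores no decompressed data at all, only a running CRC32
// and a 64-bit byte counter, so memory use stays flat however big the file is.
class HashingOutStream : public SequentialOutStream {
public:
    long Write(const void* data, std::uint32_t size,
               std::uint32_t* processedSize) override {
        const auto* bytes = static_cast<const unsigned char*>(data);
        for (std::uint32_t i = 0; i < size; ++i) {
            crc_ ^= bytes[i];
            // Bitwise CRC-32 (reflected polynomial 0xEDB88320), one bit per step.
            for (int bit = 0; bit < 8; ++bit)
                crc_ = (crc_ >> 1) ^ (0xEDB88320u & (0u - (crc_ & 1u)));
        }
        total_ += size;  // 64-bit, so totals above 4 GB do not wrap
        if (processedSize) *processedSize = size;
        return 0;  // plays the role of S_OK in this sketch
    }

    std::uint32_t Crc32() const { return crc_ ^ 0xFFFFFFFFu; }
    std::uint64_t TotalBytes() const { return total_; }

private:
    std::uint32_t crc_ = 0xFFFFFFFFu;
    std::uint64_t total_ = 0;
};

// Feeds `input` to the sink in chunks of `blockSize` bytes, imitating a
// decoder that calls Write() once per decompressed block.
inline std::uint32_t HashInBlocks(const std::string& input,
                                  std::size_t blockSize) {
    assert(blockSize > 0);
    HashingOutStream sink;
    for (std::size_t off = 0; off < input.size(); off += blockSize) {
        const std::size_t n = std::min(blockSize, input.size() - off);
        sink.Write(input.data() + off, static_cast<std::uint32_t>(n), nullptr);
    }
    return sink.Crc32();
}
```

The block size does not change the result, so the same sink works whether the decoder hands over a few bytes or a few megabytes per call. A slow bitwise CRC is used only to keep the sketch short; a table-driven CRC or SHA1/MD5 would plug into the same Write() hook.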

Roman

Re: 7z blockwise memory decompress
« Reply #2 on: 05 March 2014, 08:53 »

hmm...ok...seems that after years I'm starting to understand that C++ core.... :-)

Roman

Re: 7z blockwise memory decompress
« Reply #3 on: 07 March 2014, 10:39 »

ok...7z fans out there..

I got the blockwise decompress to memory with hash calculation hooked up...nice...works as it should...next step is to switch the actual decompress to HD over to the C++ SDK core, too...ain't hard..it's nearly the same code...

Currently I don't know how much time I've got before Easter, but it looks like you can be sure that the next version won't have any memory issues with (large) 7z files anymore. This should also increase speed for 7z operations which use decompress/hash calcs.

Compression will still be done via an external packer though...(now that I slightly understand how this stuff works I might look at that some day, too).

Well...keep you updated....



Little update:

File extract to HD works, too.... so the only thing left on the to-do list is actually testing (esp. with corrupt 7z files)...and maybe I'll also switch the table of contents reader to the new C++ 7z SDK core.....
As mentioned before, I have next to no free time, so don't expect anything anytime soon...maybe a preview which you can play around with...we'll see.....Currently I'm happy that I finally got this COM C++ core working....yieeeha....and now...a weekend...without any coding...
« Last Edit: 07 March 2014, 20:30 by Roman »

Roman

Re: 7z blockwise memory decompress
« Reply #4 on: 10 March 2014, 22:09 »

another entry in this 7z diary....completely replaced the 7z C SDK usage with the 7z C++ COM part....i.e. the central dir reader, extract to disk and extract to memory with hash calculation are now based on the current C++ SDK core....which is...erm...cool... ;-)

oxyandy

Re: 7z blockwise memory decompress
« Reply #5 on: 10 March 2014, 23:27 »

Quote
which is...erm...cool... ;-)

Yes, very cool :D
If you need a tester....

I have some archives put aside that have been previously troublesome.

Roman

Re: 7z blockwise memory decompress
« Reply #6 on: 11 March 2014, 05:30 »

yeah yeah.  test version is on its way.....

oddi

Re: 7z blockwise memory decompress
« Reply #7 on: 11 March 2014, 10:11 »

Tester #2 waiting...:)
« Last Edit: 11 March 2014, 10:12 by oddi »

Roman

Re: 7z blockwise memory decompress
« Reply #8 on: 11 March 2014, 10:33 »

thanks for the interest....
so...test scenarios could be something like...

> 4gb 7z files
corrupt 7z files
7z files with only folder entries
7z without specified file attributes, sizes or crc32 (yes, such things do exist..I'm still looking for the commandline flags to create them...)
utf8 stored filenames and if they are read in correctly...

I expect to put out a test version this week. Keep in mind, this does not change slow operations on solid archives. For now it should resolve issues with huge 7z files where the old C based code wanted to do everything in memory (i.e. it decompressed 4GB to memory to calculate the hash in one go), and of course it lets me get rid of the old C based part internally...

Starshadow

Re: 7z blockwise memory decompress
« Reply #10 on: 12 March 2014, 00:21 »

Ready for bugs? :) Using the 20140311 64-bit build, running a scan with the 'Test archive' option enabled, I get nothing but 'error while unpacking 7z file' in the Warnings window. The archives test fine in 7zip. Archive size or algorithm doesn't seem to matter.

oxyandy

Re: 7z blockwise memory decompress
« Reply #11 on: 12 March 2014, 03:52 »

Ok here we go,
Starshadow, I agree:
The Warning window does appear with 'Test archive (decompress to memory) (Scanner only)' ticked.




But as far as:
Quote
I get nothing but 'error while unpacking 7z file'
CMP actually does completely extract and check each file & the test passes !
So the 'get nothing' is a little extreme.
It seems CMP is falsely reporting the error in the warning window, but completing the task successfully anyway.



So warning yes, Task complete without error (on 'integrity tested = ok archive' !)

Roman, for me: I was now able to create a complete DIR2DAT of:
268 files = 3.5 GB worth of internal archives in a single 14 MB solid 7z.
DAT worked perfectly, thank you !
Previously this was not possible with CMP32 running on a 32-bit OS !
(Neither creating the DAT nor scanning it)
So yeah, very cool  8)

More to come later as I complete these tests. Next check: scan a hex-editor-hacked 7z archive (corrupt download simulation)

oxyandy

Re: 7z blockwise memory decompress
« Reply #12 on: 12 March 2014, 05:19 »

BAD NEWS.
As above,  'Test archive (decompress to memory) (Scanner only)' ticked
I cleared cache, even deleted the dat fully and started over.

My now hacked 7z (corrupt archive, openable, but fails the 'Test Archive' check in both 7z & WinRAR)
does not fail in CMP with 'Test archive (decompress to memory) (Scanner only)' ticked.
It should be failing ! (Only 124 of 268 original files are still good)
Scan completes 'Green'
Using 7z GUI & WinRAR GUI, only a certain percentage of the files can be extracted from the archive.

WinRAR extracts 124 of 268 original files (all perfect hashes which are fully rebuildable)
(WinRAR is the winner: if a file doesn't match its stored hash, WinRAR won't extract it, unless forced with 'Keep Broken...')

7z extracts ALL 268 of 268 files, only 124 of which still have the correct hash.
(Long-known bug in the 7z GUI (cmd too ????). I should really report it)

Anyway, back on track: dragging & dropping my corrupt 7z onto Rebuilder.
Results are as expected: 124 files were rebuilt, 144 (corrupt) skipped.
However: no warning of corruption was shown in CMP.
Good to know CMP can still make use of the 'good' files in my bad 7z archive.
The good files are the first 124 packed files in order, the 144 bad ones the remainder.
Previous CMP builds would fail with 7z: ERROR SZ_ERROR_MEM

Repeating 'Test archive (decompress to memory) (Scanner only)',
this time with "Decompress ROM & check SHA1/MD5" ticked in 'HASH & CHD':
Same "error while unpacking 7z file" in the warning window..
No incorrect files are reported.
(Actually, this time 'Files to go' = 1, and it's just stuck there, won't complete)
The corruption was NOT detected/reported ! :(

Time to wait for next build..
« Last Edit: 12 March 2014, 06:53 by oxyandy »

oxyandy

Re: 7z blockwise memory decompress
« Reply #13 on: 12 March 2014, 05:49 »

Ok so I couldn't leave it alone,

I took my 'Good 7z solid archive' and removed a single entry from the dat.
Now 1 file should be removed from my archive.
Ran Scanner, it instantly picked up the unwanted entry & asked me if I wished to remove..
I said yes, CMP did its thing, the file was removed & the newly created archive tests just fine.
So that works.
EDIT: Oh yeah, while it seems I may be stating the obvious here, this was NOT possible with previous CMP builds. A huge improvement.

Hey Readers !
Someone please test the LARGE 7z archives (over 4 GB) that failed hash checks previously and post results..
But remember, this is a TEST build of CMP, so avoid any damage to your set by working with a copy of the set/file, not the originals, huh.
« Last Edit: 12 March 2014, 06:09 by oxyandy »

oxyandy

Re: 7z blockwise memory decompress
« Reply #14 on: 12 March 2014, 07:42 »

> 4gb 7z files (Waiting for someone else to report)
corrupt 7z files (FAIL ! = Corruption not detected)
7z files with only folder entries (Multiple tests - all passed)
No crash when file with 'folder only entries' was in Rebuilder path:
No problems when 'folder only entries' in Set folder 'Scanner'
Both cases - CMP Reported:
Corrupt Archive File:Q:\Please DAT Me.7z | Reason: NO ENTRIES

7z without specified file attributes, sizes or crc32 (I know of the CMP switch in t7z, but have never seen an archive without CRC or size)  :o
utf8 stored filenames (Multiple tests - all passed)


All normal operations like:
Renames, deletes, additions to destination files, removal from source files etc. (inc utf-8)
Illegal dates.
All seems fine here.
« Last Edit: 12 March 2014, 12:41 by oxyandy »

Roman

Re: 7z blockwise memory decompress
« Reply #15 on: 12 March 2014, 08:07 »

ok thanks...I will check the reported things tonight (now...real life job...and time for a coffee)....

highly appreciate the testing from all of you...

and yeah...still think the C++ SDK switch is cool :)

oxyandy

Re: 7z blockwise memory decompress
« Reply #16 on: 12 March 2014, 12:44 »

7z files with only folder entries (Multiple tests - all passed) Hmm maybe not, read on.
I just wanted to add, I even did a Rebuilder run, where the destination folder contained a 7z Solid Archive with only empty folders inside..
CMP Rebuilder perfectly added all the files to the Archive, so now it contains Empty Folders & files.

So I just tried "Scanner" to detect and remove those "New Folder", "New Folder (1)" Entries..
Which of course are not in the dat..
FAIL, it seems they are not even seen

Ok, next test: A quick edit to my DAT to add some sub-folder paths to this archive, see how it's handled.

The Result:
CMP Noted the folder additions (via DAT edit) "Scanner" made the correct renames to include the Folders.
Great that worked..
Manually checked the archives, the files are there in their Sub-Folders.

Ok, another DAT edit to now change the Names of those Sub-Folders in the DAT to something else..
CMP is asking to Rename those Sub-Folders to the new names made in the DAT edit, good.
But, still NOT seeing the EMPTY folders as unneeded.
« Last Edit: 12 March 2014, 13:28 by oxyandy »

Starshadow

Re: 7z blockwise memory decompress
« Reply #17 on: 12 March 2014, 13:07 »

Quote
CMP actually does completely extract and check each file & the test passes !
So the 'get nothing' is a little extreme.
It seems CMP is falsely reporting the error in the warning window, but completing the task successfully anyway.

I tested on some very large >4GB 7z files first. They failed with the aforementioned warning before enough time had passed to simply read the entire archive from disk, let alone test it. When I tested smaller files, I got the exact same behavior. I didn't see any indication that any of them tested successfully.

oxyandy

Re: 7z blockwise memory decompress
« Reply #18 on: 12 March 2014, 13:12 »

Starshadow, what about without -  'Test archive (decompress to memory) (Scanner only)' ticked ?
(or in Scanner - 'Hash & CHD' - "Decompress Rom & CRC/SHA Check" checked.)
Just a regular 'New Scan' ?

Roman

Re: 7z blockwise memory decompress
« Reply #19 on: 12 March 2014, 13:32 »

Looks like it will be a long night :)

Regarding the empty folders....I wonder how old cmpro reacts....

Regarding "without crc32"...that was easy...a 0 byte file is stored without any checksum or length information (not even == 0).
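
A rough sketch of how a checker could treat such entries, assuming the archive reader exposes size and CRC as optional values. `EntryInfo`, `Verdict` and `VerifyEntry` are hypothetical names for illustration only; they are not part of the 7z SDK and not the tool's actual code.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical per-entry metadata. A field is empty when the archive stores
// no value for it, as described above for 0 byte files.
struct EntryInfo {
    std::optional<std::uint64_t> size;
    std::optional<std::uint32_t> crc32;
};

enum class Verdict { Ok, NoChecksum, SizeMismatch, CrcMismatch };

// Compares what was actually decompressed against whatever the archive stored.
inline Verdict VerifyEntry(const EntryInfo& entry, std::uint64_t actualSize,
                           std::uint32_t actualCrc) {
    if (entry.size && *entry.size != actualSize)
        return Verdict::SizeMismatch;
    if (!entry.crc32) {
        // Nothing to compare against. An empty stream is accepted as fine;
        // anything else is flagged so the caller can at least warn about it.
        return actualSize == 0 ? Verdict::Ok : Verdict::NoChecksum;
    }
    return *entry.crc32 == actualCrc ? Verdict::Ok : Verdict::CrcMismatch;
}
```

Returning a separate NoChecksum result instead of folding it into Ok is a design choice of this sketch: it keeps "could not be verified" apart from "verified good", which matters when deciding what goes into the warnings window.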

So...does this list summarize it so far?

- test archive works but may report a wrong result
- corrupt 7z files are not always flagged as corrupt (-> needs a message in the warnings window or something)
- empty folders are not reported as unneeded