EMULAB Forum

clrmamepro [English] => clrmame Discussion => Topic started by: Roman on 04 July 2011, 20:13

Title: Work In Progress
Post by: Roman on 04 July 2011, 20:13: Well, yes...pretty much no news recently....but I'm alive and currently playing around with this:

http://mamedev.emulab.it/clrmamepro/wip_july.png

Don't ask me when it's done....not much free time these days and several things (e.g. packer support for this) to do...so don't expect anything anytime soon....just wanted to say PEEEEEP....(the names are actually coming from the MESS snes hash files......just took them for some example lines...)

(Update)
ah..nice...it works (e.g. rebuilder) already with decompressed files which I did not expect actually...The shown file and folder were created via a rebuild

http://mamedev.emulab.it/clrmamepro/wip_july2.png

(again, just some dummy test files..ignore naming, sizes, datestamp and checksum)
Title: Re: Work In Progress
Post by: Roman on 06 July 2011, 21:11: So...maybe some more WIP....
so what does this unicode stuff mean at all?

Well, generally since we're now having Operating Systems which support UNICODE (if you don't have an updated OS, well, tough luck), Filenames can be in unicode or to make it simple in local language special characters which are not part of the plain ASCII-7 charset....

And if you can store files that way, you may want to list them in the datfiles correctly spelled.

So the steps to make this possible in cmpro are:

1) make a unicode compile of cmpro
ok...after doing some annoying _T() macro and TCHAR padding and resolving some LPCTSTR pointer conversions (if you're familiar with Visual C++ you know what I mean) it finally compiled

2) fixing the common char issues
well, just by adding macros and changing pointertypes doesn't necessarily mean it works after compilation...so some post work needed to be done to fix ugly little char / char* issues

3) reading unicode text files
again some more fiddling since just having a unicode compile doesn't mean you can read and display unicode characters correctly, but finally I managed to do that as you see in the screenshots.
So I was positively surprised when I ran a simple rebuild on a dummy utf8 datfile and it created the folder and file in unicode characters.

So...what happened then (aka TODAY)?

Well, I've changed all writing and reading of text files to be utf8 now. Well, it reads anything (plain ascii, utf8) but writes utf8 with a BOM (ByteOrderMark...some bytes at the beginning telling you something about the used encoding). So all cmpro ini files, miss/have list, etc will now be utf8 with a BOM.
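The read-anything / write-utf8-with-BOM behaviour described above can be sketched roughly like this. It is only an illustration in Python (cmpro itself is a Visual C++ program), and the helper names `read_text` and `write_text` as well as the file name `demo.ini` are made up for the example.

```python
import codecs
import os
import tempfile


def read_text(path):
    """Read plain ASCII or UTF-8 text, skipping a leading BOM if there is one."""
    # The "utf-8-sig" codec decodes UTF-8 and drops a BOM at the very start.
    with open(path, "r", encoding="utf-8-sig", newline="") as handle:
        return handle.read()


def write_text(path, text):
    """Write text as UTF-8 and always put a BOM in front of it."""
    # When writing, "utf-8-sig" emits the three BOM bytes EF BB BF first.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        handle.write(text)


if __name__ == "__main__":
    ini = os.path.join(tempfile.mkdtemp(), "demo.ini")  # hypothetical file
    write_text(ini, "name = 哈哈哈\n")
    with open(ini, "rb") as raw:
        print("starts with BOM:", raw.read(3) == codecs.BOM_UTF8)
    print(read_text(ini), end="")
```

Plain ASCII is a subset of utf8, so the same reader handles old ini files without any special casing.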

In case of XML files, you don't necessarily need an UTF8 BOM now, you can also specify the encoding="utf-8" attribute.
Had to remove the old utf8 xml handling...it's obsolete and it was actually wrong..
(Note to myself...hmm..maybe when writing XML files, don't write a BOM but add the encoding attribute....)

So...what's next?
Next step would be to hook up the latest zip library which should be able to handle utf8 encoding.
Internal zip, 7z and rar reader routines need to be rechecked for character conversion, too.
The compressor settings for OEM2ANSI conversion should then become obsolete...
And then some cleanup and testing...

But again...time is very limited within the next week(s), so...just be patient....
Title: Re: Work In Progress - Update 3
Post by: Roman on 11 July 2011, 20:48: Just some little updates:

- don't write BOM in case of writing XML files (using encoding attribute instead)
- write xml special characters as-is (hey...we're now in a unicode environment, so don't write &#xxxx; but of course read and parse it correctly)
- hooked up 7z unicode support
- hooked up rar unicode support
- acquired latest full version of ziparchive lib (Thanks a million Tadeusz!)
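The first two points of the list above (no BOM for XML but an encoding attribute, and characters written as-is instead of &#xxxx;) could look roughly like this. This is a sketch using Python's standard ElementTree, not cmpro's writer; the element layout and the function names are assumptions for the example.

```python
import codecs
import xml.etree.ElementTree as ET


def build_dat(game_name, rom_name):
    """Build a tiny datfile-like tree (layout is illustrative only)."""
    root = ET.Element("datafile")
    game = ET.SubElement(root, "game", name=game_name)
    ET.SubElement(game, "rom", name=rom_name)
    return root


def to_utf8_xml(root):
    """New style: encoding declaration, no BOM, characters written as-is."""
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def to_ascii_xml(root):
    """Old style: anything outside ASCII becomes a numeric &#...; reference."""
    return ET.tostring(root, encoding="us-ascii", xml_declaration=True)


if __name__ == "__main__":
    tree = build_dat("哈哈哈", "哈.bin")
    new_style = to_utf8_xml(tree)
    print("has BOM:", new_style.startswith(codecs.BOM_UTF8))
    print(new_style.decode("utf-8"))
    print(to_ascii_xml(tree).decode("ascii"))
    # Reading still accepts numeric references and yields the real characters.
    print(ET.fromstring(to_ascii_xml(tree)).find("game").get("name"))
```

Both outputs parse back to the same names, which is the "write as-is, but still read &#xxxx; correctly" part.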

So actually you can fully use it now for decompressed, 7z and rar files...wooho...
http://mamedev.emulab.it/clrmamepro/wip_july3.png (Rebuilt archives/folder)

Next steps:

- hook up latest version of ziparchive
- update internal zip to unicode which is used for in-place renames and no-compress copies
- cleanup zip setting screen, i.e. remove oem conversion, buffer and flush options...and most likely the compression level (internally use highest)...such options became pretty obsolete over the years.
- maybe check additional new features of latest ziparchive lib (>4GB zip support etc...) but that's most likely something for a future update since it most likely means that I have to replace my internal zip routines (no-recompress/inplace rename) completely...

Again, don't know when the next steps are done (especially since I don't have any free time next week)....but full unicode support is on its way...and as you can see it's already working for decompressed, 7z'ed and rar'ed sets...

Tadaa...
Title: Re: Work In Progress
Post by: Cassiel on 14 July 2011, 20:28: Outstanding mate.... just outstanding! ;D
Title: Re: Work In Progress
Post by: Roman on 14 July 2011, 21:29: ehehe, thanks,

Having a break of a week now due to family business...

Today I at least managed to do some work on the internal/own zip routines so they can read and handle utf8 now....plus some work on the internal stuff for no-recompression copy and inplace-rename...basically it's always a conversion of the filenames from utf8 buffer to something you can display...or vice versa...This needs some more work end of next week...

cough..ok...zip copy without recompression works....next: in-place rename....cough

and then I try to hook up the new ziparchive lib.... and then we're close to something useful....

And now....breeeeeeaaaak...
Title: Re: Work In Progress
Post by: Roman on 24 July 2011, 21:36: back...
at least converted the remaining internal zip routines (repair/in place rename) to utf8...next step (besides some testing) hook up new zip lib...
Title: Re: Work In Progress
Post by: Roman on 25 July 2011, 21:13: no big news today...did some testing of the internal zip routines, some fixing here and there, removed the usage of any oem2ansi conversion, zip flush, zip buffer settings and added the latest ziparchive lib....compiles and links fine. However it seems that I need to pass some special parameters so it works fine with utf8....ok...reading docs now....
Title: Re: Work In Progress
Post by: Roman on 26 July 2011, 20:50: ok...some progress...(is somebody actually reading this diary?????)

So....yesterday I did hook up the latest zipclass lib but today I had several issues with utf8 names in created zips...after some research I found out that I somehow added a wrong version (doooh). A clean remove and reinstall of the latest lib solved all issues (Thanks again Tadeusz!).

All issues solved? well...actually I found out another thing...there are several ways for 'standard' zipfiles to handle utf8 encoded names....one method (which I prefer) is to simply store the names in utf8 and hope that applications handle it correctly. Winzip (15.x), 7z (9.x) and Winrar (4.x) do...so I stick to this method...The other method is to store the filename differently and keep information about the encoding in the zip structure's extra fields....actually this also makes the zipfile larger...
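To get a feeling for the first method, here is a small sketch with Python's standard zipfile module (not the ziparchive lib used in cmpro). Python stores non-ASCII names as utf8 bytes, sets general purpose bit 11 of the zip header to say so, and adds no extra field. Whether the ziparchive lib also sets that bit is not stated here, so take this as one variant of "just store utf8". The function names are made up.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: name and comment are UTF-8


def make_zip(names):
    """Create an in-memory zip with one small dummy member per name."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            archive.writestr(name, b"dummy data\n")
    return buffer.getvalue()


def list_names(data):
    """Return (name, utf8 flag set?, extra field length) for every member."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [
            (info.filename, bool(info.flag_bits & UTF8_NAME_FLAG), len(info.extra))
            for info in archive.infolist()
        ]


if __name__ == "__main__":
    blob = make_zip(["plain.txt", "哈哈哈.txt"])
    for row in list_names(blob):
        print(row)
```

The ASCII name is stored without the flag, the Chinese one with it, and neither entry needs an extra field, which is why this method does not make the zipfile larger.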

Now that it seems that we have full utf8 support for all 3 archive types, I also cleaned up the settings->compressor screen:

* oem2ansi conversion is gone (yes...live with it...use utf8!)
* zip compression level is gone (internally 'best' is used which corresponds to '9')...I don't see any reason why this should be selectable by the user. You can now start a discussion about torrentzip or why not using bzip2 as compression method (which newer zip programs can handle)....but that's something for the bin or the future...
* zip flush option...removed...It's a relic from years ago when people had slow hds and faulty chipsets....
* zip buffer option...removed...In our days I don't think increasing this will give you a speed boost....I may try some internal testing on different values...

So....that's it for today...Next steps will be testing, testing, testing...again, I got no deadline for a release in my mind yet...

Some other remarks: Well, I'm working with Windows7 (ultimate) and it does not have any problems showing asian characters...I know from XP that you first need to install some asian-related charsets/libs so you can view such characters correctly. Windows does that for you...in your system regional settings you find a checkbox to enable asian-character-support.
Datauthors may wonder how to work with utf8 dats...well...get a good texteditor. While I'm a big fan of Textpad, I have to say that for such tasks, Notepad++ (Yes, plus plus, not the notepad.exe from your standard Windows ;)) is great to use since it offers you utf-8 saving/loading options...

Ok...that's it for now....I only wonder what happens if MAME.exe's -listxml prints out utf8 names on stdout and I redirect it and read it in....yay...something to test ;)
Title: Re: Work In Progress
Post by: Cassiel on 26 July 2011, 21:09: Quote from: Roman on 26 July 2011, 20:50
ok...some progress...(is somebody actually reading this diary?????)

I am! ;D

(as is the TOSEC project as a whole)

Quote from: Roman on 26 July 2011, 20:50
You can now start a discussion about torrentzip [...]

Had to smile at that....... ;)
Title: Re: Work In Progress
Post by: f205v on 27 July 2011, 11:40: I am too!
Title: Re: Work In Progress
Post by: Simone on 27 July 2011, 12:48: hei Roman, keep it up ;)
Title: Re: Work In Progress
Post by: oxyandy on 27 July 2011, 15:09: For sure Roman, read all the posts.
Sounds like great progress,
will miss the low level compression setting,
for times when I want to rebuild something quickly though.
Title: Re: Work In Progress
Post by: Roman on 27 July 2011, 15:59: then disable rebuilder's recompress files option ;)
Title: Re: Work In Progress
Post by: Roman on 27 July 2011, 21:36: so, the next entry in the diary...well...nothing really special today...

I've tested some standard zip/rar/7z if they work fine (ehehe...not that I get multibytes for ascii7 chars now....) and checked under which circumstances Winzip creates utf8 encoded files with extra-field usage...actually I wasn't able to produce one...so again...I will stick to the non-extra-field-usage method to use utf8.

I mostly used today's little time to align scanner's 'allow not separated bios sets' and rebuilder's 'split bios sets' option...they are both named identically now (split bios sets). Also I started to remove the scanner advanced '* SysDefPath' options from the UI......they will be internally enabled if sysdefpaths are setup...which makes more sense in my opinion....checking other options as well..maybe some become obsolete to be set by the user....time will tell...
Title: Re: Work In Progress
Post by: oxyandy on 28 July 2011, 10:12: Quote
when I want to rebuild something quickly though.

Doh, I really wasn't thinking when I wrote that, of course I could just untick "Compress Files" too.
Really don't need that setting for compression, eh..
Damn, 3am posts.

Just thought..
Is something like this possible ?
To keep parent/clone relationships ?

(http://www.upload.ee/image/1531308/merge.png)

Plus, is it possible to have a utility to clean zips of any dupe crc files ?
After all, a merged set only needs a single matching crc.
Title: Re: Work In Progress
Post by: Roman on 28 July 2011, 11:56: Enable Profiler -> Options -> parse Rom Merge Tags if you want to get rid of the dupes.
Actually I think you can force to split-merged sets if you want to avoid the removal of parent/clone relationships in case of identical names for non-identical files...but I have to check that....
Title: Re: Work In Progress
Post by: Roman on 28 July 2011, 20:01: ok...back to the main topic...utf8...

one thing I wondered about was....what happens if MAME's -listxml output prints out asian characters....and cmpro's profile points to the mame binary (ok, internally it calls MAME and redirects its output..).

...So I tested it....

1st step...picked a set (for simplicity I just used the first one in the xml, "005" from segag80r.c) and changed a rom name to something in chinese...saved it as utf8 without BOM (with notepad++), recompiled MAME....a -listxml output lists the asian characters..looks good...

2nd step...since cmpro will parse xml datfiles as utf8 only when they got a BOM or when the encoding is specified in the XML tag, I changed info.c to add an encoding="utf-8" attribute, recompiled MAME, -listxml shows it...fine

3rd step...let cmpro import the data directly.......the stdout redirector did a good job....it works ;)

So...actually if anyone ever decides to update MAME to list set names, rom names, descriptions etc in the original language...fine....DO SO!

http://mamedev.emulab.it/clrmamepro/mame_utf8.png
Title: Re: Work In Progress
Post by: Roman on 03 August 2011, 17:29: well, nothing really new regarding utf8....only made a clean compile with a clean new solution setup in my Visual Studio....actually I somehow screwed up my old one and it had major problems with precompiled headers....hehehe
so..new setup...working fine...

so...I guess next week could be a good start for some testing...if you're interested, let me know your email address...can't guarantee that everyone gets a testversion .... and as I said...earliest is sometime next week...

By the way, if you got some utf8 dats, please send them in....best would be if the crc32/sha1/sizes match MAME roms :) ....and the names / description / manufacturer tags etc could be something chinese/japanese/etc....

On a sidenote .... if you don't want to see a winrar window popping up when adding/deleting files, add a -ibck in the rar commandline params...
Title: Re: Work In Progress
Post by: Roman on 04 August 2011, 19:59: For those who are testing:

Keep in mind that dats (if they use non-ascii chars) need to be saved as utf8 with or without a BOM (ByteOrderMark). If you don't use a BOM, be sure that your xml datfile holds an encoding="utf-8" attribute within the xml tag at the beginning.
XML dats are preferred of course, however old style dats should work as well (when saved with BOM).
Again, I can recommend notepad++ for easy saving and editing dats...

...and don't expect feedback before next week ;)
Title: Re: Work In Progress
Post by: DopefishJustin on 09 August 2011, 19:33: You can try e.g. fm77av.xml from MESS as an example of Japanese names in UTF-8.

Requiring an encoding declaration for UTF-8 is bogus though because the XML standard mandates UTF-8 as the default for XML documents with no encoding specified:

Quote
In the absence of information provided by an external transport protocol (e.g. HTTP or MIME), it is a fatal error for an entity including an encoding declaration to be presented to the XML processor in an encoding other than that named in the declaration, or for an entity which begins with neither a Byte Order Mark nor an encoding declaration to use an encoding other than UTF-8.

http://www.w3.org/TR/2006/REC-xml11-20060816/#charencoding
Title: Re: Work In Progress
Post by: Roman on 09 August 2011, 20:19: Well, actually I already tried that in the past and it complained that the file either needs a BOM or the encoding attribute...but fine...if the XML standard says that a non-existing encoding attribute falls back to utf8, I can easily add that...
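The fallback rule discussed here (BOM first, then the encoding attribute, otherwise utf8) can be sketched as follows. This is a simplified Python illustration, not cmpro's parser: it only looks for a utf8 BOM and an XML declaration at the very start of the data, ignores UTF-16 entirely, and the function names are made up.

```python
import codecs
import re

# Matches encoding="..." (or '...') inside a leading <?xml ...?> declaration.
# \x22 and \x27 are the double and single quote characters.
_DECLARED = re.compile(
    rb"^<\?xml[^>]*?encoding\s*=\s*[\x22\x27]([A-Za-z][A-Za-z0-9._-]*)[\x22\x27]"
)


def sniff_xml_encoding(data):
    """Pick a codec name: BOM wins, then the declaration, then UTF-8."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decodes UTF-8 and drops the BOM
    match = _DECLARED.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # default when neither BOM nor declaration is present


def load_xml_text(data):
    """Decode raw XML bytes with the sniffed encoding."""
    return data.decode(sniff_xml_encoding(data))


if __name__ == "__main__":
    print(sniff_xml_encoding(b'<?xml version="1.0"?><datafile/>'))
    print(sniff_xml_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'))
    print(sniff_xml_encoding(codecs.BOM_UTF8 + b"<datafile/>"))
```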
Title: Re: Work In Progress
Post by: Roman on 10 August 2011, 19:36: ok...some news...somehow related to utf8

- made stats.ini an utf8 file with BOM (to show middle point at the beginning correctly)
- fixed some name conversion for 7z files
- fixed a typo which caused the batcher to not correctly setup rompaths
- xml files are now loaded as utf8, no matter what encoding you specify

Generally you might face issues when you got existing archives which have the name stored in a local-page encoding...and then they get read as utf8 within cmpro...so you run into some wrong conversions...but I hope over time people get rid of such issues...
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 07:50: well well well...Looks like hard times are coming up ;)

while testing I commonly see that users react on wrong character encoding when working with zipfiles which were created outside of cmpro...

So...let's start by saying that utf8 and zip files are two worlds...there is no real standard how files are stored as utf8 encoded.

Winzip15 allows storing the filename as utf8 with no extra information. Simply store the utf8 hex bytes and you're done...fine...cmpro handles that..

If you use an older packer version or a different product the name is most likely stored with the local code page encoding....so what will happen is that cmpro reads the zipfile in and converts the filename based on the assumption that it's utf8 encoded...which ends with an output you did not expect. If this happens, well, tough luck, let cmpro recreate the file for you....
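What goes wrong in that case can be shown with a few lines of Python. This is only an illustration: the sample name and the choice of cp1250 as "local code page" are assumptions, and cmpro's own conversion may react differently to undecodable bytes (e.g. drop them instead of replacing them).

```python
def decode_as_utf8(raw_name):
    """Decode stored name bytes as UTF-8; undecodable bytes become U+FFFD."""
    return raw_name.decode("utf-8", errors="replace")


def is_valid_utf8(raw_name):
    """Tell whether the stored bytes form valid UTF-8 at all."""
    try:
        raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    name = "02 Diaboł.bin"  # sample name with one non-ASCII character
    stored_utf8 = name.encode("utf-8")      # what a utf8-aware packer stores
    stored_local = name.encode("cp1250")    # what a code-page packer may store
    print(decode_as_utf8(stored_utf8), is_valid_utf8(stored_utf8))
    print(decode_as_utf8(stored_local), is_valid_utf8(stored_local))
```

The same bytes that look fine on the machine that created the archive turn into a replacement character (or worse) as soon as they are treated as utf8.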

I'm currently checking if Winrar zipfiles have similar issues or if they are using the 2nd way to store utf8 filenames in zips (by the way, officially winrar supports this for zips since version 3.80) which is by using the zip extra field for some encoding information. If so, I have to update my zipreader a bit to use the extrafield information...

There were comments that the encoding goes nuts after torrentzipping the file..well..then the torrentzip guys should double check their way of encoding names (I assume they do local code page encoding) as well...but I will check if they maybe use the extra-field method already.

rar and 7z archives seem to have no issues at all anymore since the last build....and Winzip created zips and cmpro created zips either.

...so the next step will be to double check winrar zips...

Keep in mind that the goal is to support utf8 encoding, not to support any local page encoding...
Title: Re: Work In Progress
Post by: oxyandy on 11 August 2011, 09:23: Code:
rar and 7z archives seem to have no issues at all anymore since the last build....and Winzip created zips and cmpro created zips either.

I tested the reading of "Winzip v15.5 created zips" (and other zip creating programs)
and Dir2Dat still creates the same problem as "WinRar created zips"
No file-name is outputted.
It still stands, the only type of zip CMP can read the internal file name from, is a "CMP created zip"

<game name="哈哈哈 (ZIP MADE BY WinZip 15.5)">
<description>哈哈哈 (ZIP MADE BY WinZip 15.5)</description>
<rom name=".txt" size="12" crc="c6302903"/>
</game>

Anyway, enjoy your day at work.
We will discuss more later.
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 09:29: Check winzips15 options if you got "store filenames as utf8" checkbox ticked....

And for another test: drop the zipfile which gives you an empty filename in dir2dat into the about window...does it list an empty filename there, too?
Title: Re: Work In Progress
Post by: oxyandy on 11 August 2011, 09:37: Ok, did that.

This doesn't say exactly "store filenames as utf8"
But it does say "Store Unicode filenames in zip files"
By default that was already checked.
(http://www.upload.ee/image/1569706/wzv15.5.png)
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 09:43: yeah yeah..that's the option I was talking about...good to see it's checked...

Now please drag'n drop the file from your dir2dat source (the one which creates the empty filename) into the about window...
a window should open, listing all entries of the zip...I wonder if the name is listed there (correctly).
If it is listed there, then dir2dat somehow cuts off the name....if it's not listed there, then it's related to the encoding / zip reader.
Title: Re: Work In Progress
Post by: oxyandy on 11 August 2011, 09:44: Ok did the drag drop test too..
Same, name missing
Code:
Path: G:\Desktop\New Folder (10)\哈哈哈 (ZIP MADE BY WinZip 15.5).zip
Name: .txt
Size: 12
CRC32: c6302903
MD5: 9f8c1945784842810f22262b7d3aef0f
SHA1: cfe308fd34664b17cd752537fb04cb59d0bb5070
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 09:56: thanks! good to know...so it seems it can't convert the chars...so the issue is not in dir2dat but in the reader

I will check if you can specify a default char (like a ?) in case of a not converted char...I assume you've mailed me that "\哈哈哈 (ZIP MADE BY WinZip 15.5).zip"....please do so...I bet it uses the extrafield for additional information
Title: Re: Work In Progress
Post by: oxyandy on 11 August 2011, 10:06: Well,
I haven't emailed the "WinZip v15.5 created zip" yet,
but now you mention it, I will.

I just hope you will humour me,
remove the "forced UTF-8 encoding" in source, compile and get me a copy.

If a build like this reads and stores the name in the dat,
then I bet a beer, it will also output a zip with the filename too.

Then problem over, without you even looking into the internals of the zips.

Code:
so the issue is not in dir2dat but in the reader

'Cause don't forget it's not just the ZIP reading, but also the ZIP writing which is quirky!
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 10:15: Again...non-utf8 encoding is not an option here. It will then simply use local page encoding which may work on your side but nowhere else. The goal is to use utf8 stored names and give a s**t on local page encoding.

Zip writing is not a problem. It uses the ziparchive lib for saving (unless you use 'no recompress' in the rebuilder, then the name is converted internally and not in the lib!!!) and that stores the filename utf8 encoded and winzip, winrar and 7z shows the name correctly. It uses the method to store this without EXTRA field usage which all 3 major programs handle fine.
Title: Re: Work In Progress
Post by: Cassiel on 11 August 2011, 16:31: FINALLY got around to doing some testing....

Couple of small things so far:
- dir2dat: creates DAT OK using a mix of files with some VERY exotic file names. When opening DAT in Notepad++, the file "06 'and'.smd" appears as "06 &apos;and&apos;.smd". Shouldn't the ACTUAL character now be shown rather than the code? Doesn't seem to be an issue with other 'weirder' characters.
- scanner: file "02 Diaboł.bin" appears incorrect (visually at least) in the scanner window, i.e. the "ł" is replaced with a black rectangle. Fixdat is correct however. When rebuilt it is also correct in the archive and scanner recognises the file as correct once rebuilt. (packer = rar).

Not experienced any issues with the actual rebuilder.

Using:
CMP (email on 20110810) (x64)
Windows 7 Ultimate (x64)
WinRAR v4.01 (x64)
Notepad++ v5.9.2 (x86, unicode)
Title: Re: Work In Progress
Post by: Cassiel on 11 August 2011, 16:45: Repeated steps using packer = zip (internal zip lib, NOT utilizing WinZip/WinRAR/7z) and experienced no issues with rebuild/scan. Performed exactly the same as with RAR (i.e. the issue with "Diaboł" remains, but rebuilt correctly).
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 19:25: ok....I won't write apos anymore for an apostrophe... ;)

now back to the general problem: it seems that tools like Winzip do use the extra fields for utf8 filenames. Funnily enough I can use Winzip without doing that and ziparchive lib I'm using can use this mode fine...
So cmpro created files (forget about no-recompress at the moment) work fine in Winzip and within cmpro.
Problematic are files which exist before you run cmpro on them...

Either:
- they got a local code page encoding, then cmpro reads them in, converts them as utf8 (which is wrong) and you got wrong characters. Rezipping them with cmpro will fix this.
- or they use the extra field to store utf8 additional information...which my current parser does not handle....I try to update it...or switch to the ziparchive readers completely...
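For the second case, a reader has to walk the extra field blocks of a zip entry. The sketch below assumes the extra field in question is the Info-ZIP "Unicode Path" field (header ID 0x7075: one version byte, the CRC32 of the normally stored name, then the utf8 name); that assumption is not confirmed in this thread. It is Python for illustration only, and all function names are made up.

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075  # Info-ZIP Unicode Path extra field (assumed here)


def iter_extra_fields(extra):
    """Yield (header_id, payload) for each block in a zip extra field."""
    offset = 0
    while offset + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, offset)
        yield header_id, extra[offset + 4: offset + 4 + size]
        offset += 4 + size


def unicode_path(extra, raw_name):
    """Return the UTF-8 name from the extra field, or None if not usable."""
    for header_id, payload in iter_extra_fields(extra):
        if header_id != UNICODE_PATH_ID or len(payload) < 5:
            continue
        version, name_crc = struct.unpack_from("<BI", payload, 0)
        # The CRC ties the field to the name stored in the regular header.
        if version == 1 and name_crc == (zlib.crc32(raw_name) & 0xFFFFFFFF):
            return payload[5:].decode("utf-8")
    return None


def make_unicode_path_field(raw_name, real_name):
    """Build such a field (only needed to demonstrate the parser)."""
    payload = struct.pack("<BI", 1, zlib.crc32(raw_name) & 0xFFFFFFFF)
    payload += real_name.encode("utf-8")
    return struct.pack("<HH", UNICODE_PATH_ID, len(payload)) + payload


if __name__ == "__main__":
    stored = b"___.txt"  # placeholder for the name in the regular header
    field = make_unicode_path_field(stored, "哈哈哈.txt")
    print(unicode_path(field, stored))
```

If the CRC does not match the name in the regular header, the field is stale and gets ignored, so the reader falls back to the normally stored name.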

Thanks for testing...
Title: Re: Work In Progress
Post by: Roman on 11 August 2011, 20:49: well, I quickly wrote a zip parser based on the already used ziparchive lib and it seems to solve the remaining issues with extra-field utf8 etc....(I need some confirmation from oxyandy though)

Problem is....I need to get rid of all the other internal in-place-rename and no-recompress copy routines...The good news is that the library already got such functions, however I need to adopt them...which takes time...which I currently don't really have...

I will have a week end of August....and some hours here and there till then....
Title: Re: Work In Progress
Post by: oxyandy on 12 August 2011, 02:25: Ok,
a really bad night's sleep, my daughter had a terrible fever.

Awake now, first test shows.

1. The old output from rebuilder which was torrentzipped, now shows the incorrect name.
(great cause it is)

2. Any zip I have tested, now shows the correct internal contents !

So Dir2Dat is now producing a dat which matches the ZIP files 100%
8)

Torrentzip handles 'special characters' badly,
clearly would need to be re-written to cope with Unicode names.

EDIT:
Tested everything else I had issues with previously,
Faultless so far.
Time to get imaginative with 'testing'

"the "ł" is replaced with a black rectangle."
There are 1000's of Unicode characters which will show as black rectangles.
Unless the CMP output windows uses a Unicode font. ??
This Font would then need to be supplied with CMP download package as many OS's are likely not to have them.
And then, ah what a nightmare, there are Unicode characters which belong to specific font sets.

Even Notepad++, needs setting to a Unicode Font, otherwise, of course
it wont show the characters of that font set correctly.

So you might make a dat, which has outputted perfectly, (and it really seems to do this perfect now)
but because you haven't set the correct font in Notepad++
will only show as rectangles too...

Hmm, Roman, would it be possible to have a 'selectable font' for CMP's output windows..
::)
While I'm suggesting things, compressor settings..
Could CMP take for granted that everyone has 7z and WinRar already installed, at their 'default' locations, please.
So they work, 'out of the box' ?
Title: Re: Work In Progress
Post by: Roman on 12 August 2011, 07:23: As a father myself I know how these nights are....so all the best for your daughter...

Good to hear that the new reader works fine (again, don't use your special build for any real scanning/fixing. Actually fixing names or no-recompress will use the old routines which will create wrong chars...or even worse since they rely on some attributes which the new reader doesn't fill in yet).

cmpro fonts....hmm...not in the first planned release :)
torrentzip....hmm...I did already mention that this is not my scope ;)
compressor settings...well...when it comes to zip, for now I would go with the standard ones (level 9, deflate) which the ziparchive lib brings....for 7z/rar you can modify the settings/compressor/7z (rar) edit fields with your own commandline params....yes...I expect winrar and 7z to be installed and located in your %PATH% environment. Otherwise you need to adjust the location in the compressor settings.

So next steps: replace old custom zip routines with ziparchive class calls.....
Title: Re: Work In Progress
Post by: Cassiel on 12 August 2011, 12:37: Re output window (big slap for me), didn't occur to me it would be simply due to the font used. Obvious.

Selectable font option (i.e. Arial is included in Windows, Arial Unicode is installed by default by Office) would be helpful (and avoid such issues/questions).

Or maybe just use something like:
http://en.wikipedia.org/wiki/GNU_Unifont
by default?
Title: Re: Work In Progress
Post by: Roman on 12 August 2011, 12:55: as I said...."later" ;D
Title: Re: Work In Progress
Post by: oxyandy on 12 August 2011, 14:13: I was thinking old school when I mentioned a 'download' for a Unicode font.
Vista onwards, should have some Unicode Fonts as standard.

EDIT: Even XP has "Arial Unicode MS Font" released as an update at one stage.
Title: Re: Work In Progress
Post by: Roman on 12 August 2011, 21:23: ok...some fine progress this evening...

remember, I had to adjust all internal ziproutines to fully use the lib...fine...wonder why I didn't do it before since it's pretty easy....so say goodbye to strict scanning option, some internal zip repair etc...

Renaming files (or multiple files) within a zip is just some few lines of code now....so inplace rename, case fix, datetime fix are all done...

Still on the list (and that will take a bit more work I guess) is the rebuilder no-recompress option...but the lib got something for this I bet....

Will do some speedtest of heavy renaming in zips or reading zips to see how the lib performs...actually I think it will be faster.
Title: Re: Work In Progress
Post by: oxyandy on 13 August 2011, 04:33: I am happy to help speed test..

I have a set approx 1500 zips, which requires approx 20,000 renames.

Big enough to bench-mark the renaming with "Scanner"
then repeat with current release, if this helps.
Title: Re: Work In Progress
Post by: Roman on 13 August 2011, 20:02: one quick test...

a full MAME 143u2 scan (fully diskcache buffered though): old and new scanner 18 seconds....maybe the old one a little bit faster (17 seconds) :( but hey, the old one was really optimized down to very limited reading of some fields only....

a full rename of MAME32 snap files in the zip (3688 times *.png -> *.xxx ;)): old cmpro 36 seconds...new one 10 seconds ;) yummi....
Title: Re: Work In Progress
Post by: oxyandy on 14 August 2011, 06:39: That really is a quick test..
Seems faster and faster is good.

I have done more testing with beta 2.
It really puts a smile on my face.
I was able to batch load a Tosec dat set, without having errors spit out at me everywhere..
I usually need to go into the dats manually and edit them to just get them to load.

Hanging out for next release.

8)
Title: Re: Work In Progress
Post by: oxyandy on 14 August 2011, 11:31: I thought I was having some kind of hardware failure, possibly RAM.. Hi-end DDR3 4GB..
It's really hot where I live, so hardware failure isn't unlikely.
So, I ran Memtest-86 for a few cycles, no errors.
;D

These betas must have some bad calls to ram ?

This sound right Roman, do you know of this ?

I have a folder with the beta and it's files, and it is constantly, but randomly, crashing.

Maybe I should send the folder 'as is' and some crash logs to you, to repeat the problem.

(http://www.upload.ee/image/1577002/Capture_08142011_184227.png)

This seems to be common with all errors:
instruction at "0x0041d280"
So hope this means something to you.
Title: Re: Work In Progress
Post by: Roman on 16 August 2011, 12:47: I told you that your version is only for a quick test. It uses some unfilled structures since I only hooked up a quick'n dirty lib based reader for you which only fills the main attributes for a utf8 name check. Other internal attributes were not correctly filled so that routines like zip in place rename, no-recompress copy and some others will fail....
Title: Re: Work In Progress
Post by: oxyandy on 17 August 2011, 03:14: I am not talking about the last build "new-reader"
the previous version..

Anyway, next build, I know exactly what eЯЯors to look for :D
Title: Re: Work In Progress
Post by: Roman on 17 August 2011, 20:24: Some news..

- completely replaced own zip routines with library. This includes inplace rename, copy-without-recompress (which is actually just a few lines with the lib...coool..hooked up scanner/rebuilder no-recompress to use this...)....some speedtests will follow....
- removed: rebuilderAdvanced buffer and auto mem options for recompress
- removed: compressor / zip settings / full zip structure scan

Next step....testing...and having a look at the new device_ref MAME stuff....

Update...
Little rebuilder no-recompress test....

2315 single files in 240 zips, total zip size is ~3GB (somehow reminds me of some famous bios based collection...:))

New cmpro: 4 min 45 secs
Old cmpro: 7 min 21 secs

yummi.....
Title: Re: Work In Progress
Post by: oxyandy on 19 August 2011, 04:01: unicode_test_20110818

Not much to report yet. Seems to be doing ok.

1. Scanner results window, copy set name to clipboard is only copying first letter.
However fix.dat is saving fine.

2. Batch mode - Rebuilding from 7z & Rar archives seems not to work. ( With either remove source ticked or not )
Zips fine.
Rebuilder, same.
I get the "Want to stop ?" Message. (oh this is with non-unicode file names, so nothing special)

Dragging -dropping either zip, 7z or rar, onto 'about' window, gives correct output. (with unicode file name & contents)
Dir2Dat - 1 folder 3 archives, zip, rar & 7z, gives correct output.

Scanner - With either zip, 7z or rar archived sets, reads fine.
Title: Re: Work In Progress
Post by: Roman on 19 August 2011, 07:05: sounds good! thanks for testing...
1 sounds like a tiny bug which may be related to some char* to TCHAR conversions...I will check that...should be an easy fix
2 sounds hmm interesting...especially since I've done some rar rebuilds lately....but I will check that too...

in the meanwhile I'm working on the "devices" stuff for new MAME....
Title: Re: Work In Progress
Post by: Roman on 19 August 2011, 21:50: ok...fixed the clipboard issue and partly added "device" support

more tomorrow...rar/7z fix and finalize device support
Title: Re: Work In Progress
Post by: Roman on 21 August 2011, 20:09: so so so...coming closer to a release I guess (currently facing maybe last week in August)...
Thanks to the testers all reported issues are now history...

Additionally some preliminary devices / device_ref was added...well at least they are not handled as bios sets anymore and you can assign a path (like standard or mechanical) to them. There are some restrictions (like device != bios, only one device_ref) but time will tell what MAME will do and what cmpro should check.

On my list now is some final testing (e.g. with set-subfolders) and actually a new documentation....

Thanks again to all testers
Title: Re: Work In Progress
Post by: oxyandy on 22 August 2011, 05:03: I have tested every previous issue reported so far.

Yet to find a glitch,
nothing more said, means nothing found.
I am not done yet.
Title: Re: Work In Progress
Post by: oxyandy on 22 August 2011, 11:05: Bored with testing....

Finding a fault is too hard :P

Rebuilder
Sources
from rar
from 7zip & t7zip
from zip & tzip
from extracted files
Read Only Archives (such as found on DVD/CD)

Any type of output works.

Scanner
7z set
Rar set
Zip set
Read Only Sets

Batch Mode
No glitches loading any dat so far.
Destination folder, sets right path.
Rebuild, remove source, then Scan
Any output.

Dir2Dat
Extracted files with and without sub-folders
Zips with internal folders
Archives with Unicode Characters in file-names and contents

EDIT: OK, like the last time, I now have a replicable crash.
In my example,
after a batch run completes and CMP is showing the "Looking for dat files" msg.

The instruction at "0x0041d300" referenced memory at .......

When I have a chance,
I will transfer my CMP folder 'as is' to other pc or laptop to see if it repeats.
Title: Re: Work In Progress
Post by: Roman on 22 August 2011, 12:32: erm...so what are the exact steps for your issue again? With all these "working" things, I don't see what belongs to the error and what not...Please only list the issue again..thanks

ok..after rereading it's happening while the profiler reloads the profiles/datfiles after a completed batch run....interesting...trying to repeat it here somehow....if you can create a minimum setup....would be great.
Title: Re: Work In Progress
Post by: oxyandy on 22 August 2011, 15:54: The exact steps.

After successfully loading various runs of dat files in batch mode
and having them complete processing as expected.
The crashes start.
Well actually the crashes also happen at other random points.
So I should say, they become regular and predictable.

I have created a screen capture most times when it has crashed.

It doesn't happen when the program loads.

I think it is best you take my whole CMP folder 'as is'

HMM
The easiest and fastest way to repeat the crash, is to right click,
delete any entry in the MESS folder.
I do not feel any file in the MESS folder is the cause...

Anyway if the program crashes on your pc
the output will go straight to the 'debugger' -- right ?
Will this give you a better idea... ?

It is about a 50 MB 7z..
Title: Re: Work In Progress
Post by: Roman on 22 August 2011, 16:11: well...screenshots of the windows error message won't help me...it would help me to tell me *where*/*when* it happens...
Right click and delete mess files? right click where? profiler? scanner?
I will do some research this evening...but if you can provide the files somewhere, that would be great...
"randomly" of course doesn't sound that good....maybe an uninitialized variable...don't know...need to repeat it...then the debugger will catch it....

I've done several scans with and without rebuilder and never had that error....so it's not that common ;)
So...a full dump of your folder would be a start...plus instructions how to repeat it...
...or maybe it only happens when scanning zip files....or only rar ... or only 7z files....? Try to minimize the scope...
...and of course turn off virusscanners... ;)
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 02:13: Virus scanner off, yes.

Is not happening during any kind of actual processing.

It is happening After : when CMP shows the message "Looking for dat files" such as after a batch run,
however in the process of reducing the size of my CMP to share with you.
I found after deleting a dat (in this case from a profiler folder I have called MESS)
CMP then shows the message "Looking for dat files" and crashes.

This is the fastest way to repeat the crash.
This is happening 8 times out of 10.

It is not happening when I first open CMP, even though CMP shows the message "Looking for dat files"

It is not happening during a large batch operation, rebuild, match, remove then scan, then wait 1 sec, do next.
This all completes faultlessly, but then CMP after the batch run,
shows the message "Looking for dat files" and then crashes.

So when I say randomly crashing I never mean during a job.
So the type of archive 7z zip rar, has nothing to do with it.
I think with all the heavy testing I have done, these routines work really well.
(reading & writing archives, of any kind, no bugs here)
No bugs during the real 'work'...ok :)

I have emailed a link to my actual CMP folder 'as is'
To repeat the crash all you need to do is right click, delete an entry in the profiler, it will happen.
(here's hoping)
It won't matter that the paths etc don't exist, because you don't even need to load a profile.

I am about to move over to another OS and try the folder 'as is' see if it repeats.
Will post results shortly.
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 02:27: Ok, really good news,
I extracted the archives contents here on this OS,
opened CMP, went to the MESS profile folder,
(however deleting ANY entry in any profiler folder should do the same crash)
deleted an entry, crash
Repeated 5 times, every time it crashes.
;)

I already think I know what you will say is (part of) the cause,
but I know if you can stop it,
it will only make CMP tougher.. he he

Remember it has already loaded and successfully processed all the dats
that are currently in the profiler, very well.
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 06:59: Sounds like an array index access issue...however I wonder why it crashes for you after a batch run (which doesn't remove any profiles)...and actually this should also happen on the last official, non-unicode build.....anyway...I will do tests later today...and I guess I should check both, 64 and 32 bit ;)
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 07:12: Just tell me from what I sent....
Just deleting a single entry is enough to reproduce the crash...

The solution and cause, well that can wait..
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 07:52: well, currently I'm at work and have to fiddle with JS/XSLT and soap requests ;) You have to wait approx 10 hours....
but this really sounds like an array access that tries to get data from a previous index (now invalid due to deletion) and that leads to a crash....sounds pretty easy to fix as soon as I've found the place...but I still wonder why this happens after a long batch job where no index is removed or added..anyway...I will check it...
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 18:53: http://www.upload.ee/image/1602829/relog.swf

Tick, Tock.

3.10am local time, 31 Celsius, 75% Humidity.
One last coffee, after having to replace bedroom air conditioner power socket.
Then it's time for a nap.
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 19:28: thanks for the files.....however...it does not crash here...
I've deleted several single profiles from your mess folder, nothing....I removed all...nothing....I removed the mess folder itself...nothing....
...and tried that several times...no issue :(
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 19:33: Oh, I had tried testing the crash on a different OS 'as is'

idea, one last thing to check..
quick reboot, I'll be back.
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 19:45: Coool......I got a crash
...but only under Windows XP (Virtual Machine)...not under Win7....

but now I should be able to fix that....
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 20:03: Alright this is another OS XP 64 SP2.
Still the same 32 cmp archive I sent.

it differs from the other 2 OS's I've tried this on,
cause the settings for Non-Unicode programs is English. (others this is Chinese)

I extracted the contents, and
tried Resetting some entries in the MESS folder, crashes still.
Tried deleting some, still crashes.

went to NonGood folder deleted an entry, crash again.
reset an entry in Nongood, again crash.

This is a very clean 'fresh' install, I keep as an image.
It has no AV at all.
But also no malware or any nasties, cause it's 'as new'
I don't have Win7 setup on these drives. So I can't try it... yet...

I'm not sure why I can repeat it so easily here and you can't.
Not sure what differences we have.
Maybe I could set-up some debugger, like visual studio has.. ?
Title: Re: Work In Progress
Post by: oxyandy on 23 August 2011, 20:07: Ok phew...
Our posts overlapped.. sounds good..
I'm not a 'nut' then :P
Hope you sort it out easy.
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 20:10: seems to be a weird one....since if it's a null pointer or wrong index it should crash everywhere...
wonder what the debugger will show....as soon as I've installed the development environment on the virtual machine....which takes some time...
Title: Re: Work In Progress
Post by: Roman on 23 August 2011, 22:04: argh...debug build on XP doesn't show any issue....smells like a compiler / optimizer issue...now that'll take some time to find that.....
Title: Re: Work In Progress
Post by: oxyandy on 24 August 2011, 02:16: Shame it's tricky.
I had hoped it would be easy for you.
So, on XP the crashes happen just like I describe here. ??
Win7, no crash..
Hmm
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 05:16: win7 no crash
winxp no crash when running debug or release from Visual Studio and on your files
winxp crash when running your exe on your files
winxp crash when running release compile on your files outside of Visual Studio

It seems to be related to a string class though...
Title: Re: Work In Progress
Post by: oxyandy on 24 August 2011, 06:22: Roman,
I hate to think you need to spend too much time on this crash, because the cause is the badly created dats..

I am really impressed that this new CMP can take dats 'as they are'..
Batch run them without any stops ( a first for me with this Unicode version )
Usually it takes painful hand editing before they will even load.

Like bad dats that use Unicode characters in the entries & author's name, yet are saved in ANSI.(sigh) The forum administrator of a group making these dats was informed much earlier this year, problem discussed, solutions suggested, yet nothing changed, cause the next dat release still contains the same exact badly formatted files.

If your program was only having to handle well made and formatted dats..
would it crash at this point at all ?
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 07:22: If the dats are structurally not ok, cmpro won't load them...if it's just a bad character conversion they are loaded (however bad characters are shown then)...
I doubt that such dats cause a crash...
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 09:43: Looks like I've nailed it down...

never ever do pointer arithmetic when working with unicode :O)
Title: Re: Work In Progress
Post by: oxyandy on 24 August 2011, 10:10: Nice,
I went through the list, reset profiles, deleted
all from a fresh extract so it had everything jammed in there...
The crash is gone....

So it was a bug, you nailed it..

Now it seems invincible....

Is it final ?
We shall see..
I haven't finished trying to make it crash :D

Adv_AllowRestart = on

Let's take it for a test drive.
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 10:43: actually I'm not sure if it was an actual bug....

debugging the lines is fine, running them is fine (on Win7 always, on WinXP always when run from Visual Studio)...but a standalone release creates an issue with some Windows based libs......sounds more like a compiled code issue..

anyway, I rewrote the few lines and replaced some pointer arithmetic with basic function calls and it works....case closed.
Title: Re: Work In Progress
Post by: oxyandy on 24 August 2011, 11:04: Ok please do tell, it's great work,
how long did it take to resolve ?

It seems so solid now.
Not crashed
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 11:55: well...resolving was pretty fast...like 5 minutes..

I had an idea which part of the profile reader could be the cause for the fail after the debugger said something about the string class...
For the unicode build only that part had some weird LPTSTR casts here and there...so I rewrote that part and it worked...
Title: Re: Work In Progress
Post by: oxyandy on 24 August 2011, 12:41: That's a bit like me telling you I had an intermittent fault with my Plasma TV...
It only took me 1.5 minutes to fix (solder in a new part)
I wish... lol

I am done with 'testing', for now.
It's working really well now !

I may have a use for it later this evening.

I know what (bugs, glitches, errors, crashes) I found..
see.. being real careful on my word usage now.

I didn't test too hard after I found certain errors and reported them to you.
What did I miss, that others found ?
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 12:51: you missed minor stuff....clipboard wasn't fully fixed after I fixed your reported one for example...actually you reported most stuff.
Title: Re: Work In Progress
Post by: Roman on 24 August 2011, 19:06: heheh..actually another routine used the exact same pointer arithmetic...so ...another place to fix ;)