Properly handle UTF8 and long-paths on Windows #189
Comments
@animetosho any chance we can get you interested in fixing this one?
If someone submits a pull request to par2cmdline, I can consider merging it into par2cmdline-turbo.
I intend to submit a PR for this. I've got UTF8 working, including in the console; it just needs tidying up. I haven't really looked at the long paths issue yet, but from a quick glance I think there are numerous issues.
@mnightingale Any updates on this work?
Sorry, I have been busy with other things. The Unicode problem is much more involved than expected: it needs a lot of MultiByteToWideChar calls, plus conversion the other way when outputting to the code page in use by the console. I have some changes that work, but they are very messy and would break non-Windows builds. The long path problem appears to be a simpler fix, at least for verify; create needs more work on how it splits a path into components and on the MAX_PATH it uses for buffers.
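For anyone following along, here is a minimal sketch of the kind of conversion described above, not the code from any branch mentioned in this thread. `Utf8ToWide` is a hypothetical helper name. On Windows it uses the usual two-call `MultiByteToWideChar` pattern (size query, then conversion); the non-Windows branch is only a simplified stand-in so the sketch builds elsewhere, and it assumes a 32-bit `wchar_t` and skips overlong/surrogate checks.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from par2cmdline): convert UTF-8 to a wide string
// that can be passed to wide-character Win32 APIs such as CreateFileW.
// Returns an empty string if the input is empty or not valid UTF-8.
std::wstring Utf8ToWide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
#ifdef _WIN32
    const int in_len = static_cast<int>(utf8.size());
    // First call: ask how many UTF-16 code units the result needs.
    const int out_len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                            utf8.data(), in_len, nullptr, 0);
    if (out_len <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(out_len), L'\0');
    // Second call: convert into the correctly sized buffer.
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8.data(), in_len,
                            &wide[0], out_len) != out_len) {
        return std::wstring();
    }
    return wide;
#else
    // Stand-in for other platforms (assumption: 32-bit wchar_t, so one
    // element per code point). Simplified: no overlong or surrogate checks.
    std::wstring wide;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp;
        std::size_t extra;  // number of continuation bytes expected
        if (lead < 0x80) { cp = lead; extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else return std::wstring();  // invalid lead byte
        if (i + extra >= utf8.size()) return std::wstring();  // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(utf8[i + k]);
            if ((b & 0xC0) != 0x80) return std::wstring();  // bad continuation
            cp = (cp << 6) | (b & 0x3F);
        }
        wide.push_back(static_cast<wchar_t>(cp));
        i += extra + 1;
    }
    return wide;
#endif
}
```

The reverse direction for console output would use `WideCharToMultiByte` with the console's code page in the same two-call style; that part is not shown here.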
Is there anything we can learn from nzbget's handling of this?
Looks like there might be interest from there in fixing it (mentioned issue); they certainly have more experienced C++ programmers than me 😄 But yes, nzbget handles some of what needs implementing here, see helpers.
How MultiPar handles it may also be useful to someone else looking: https://github.com/Yutaka-Sawada/MultiPar/blob/d733ada21ae81405de468dd2cd458bcbf09ab9ea/source/par2j/common2.c#L66-L204 specifically the parts for converting to the code page used by the console and handling the possible errors (try again with dwFlags = 0).
edit: just to mention, the other part, which nzbget doesn't have to worry about, is that every filename, including those from par packets, needs preparing for the console when printing log/debug lines, so you don't get "?" or broken characters.
@Safihre I've cleaned up what I'd started working on: https://github.com/mnightingale/par2cmdline/tree/unicode_filenames It won't compile outside of Windows currently, and it only attempts to address the problems with UTF8 filenames. I haven't made a draft PR for now as I don't want to put someone else off having a go.
edit: the main thing is figuring out how to remove TCHAR; at the time I just wanted to get it to compile.
I think it's about there now, I've added support for
The changes can be applied to par2cmdline-turbo with minimal modification.
@mnightingale We only have to care about Windows; on the other platforms there are no problems. The SABnzbd unit tests already show that :)
Test file has:
But generating new recovery volumes with par2cmdline uses Unicode, which is against the spec, though I don't think it's the only implementation to do so.
I think I'll just remove the part trying to handle this; then it will behave the same broken way on all platforms.
As requested by @animetosho, reporting the original bug here 💯
On Windows, par2cmdline can basically only handle ASCII filenames. This was fine when par2cmdline was created long ago, but the world is UTF8 now.
Simply trying to open a file named `你好世界.par2` will fail. This was fixed long ago in `par2-tbb`, but that fork has been unmaintained for a long time.
It also displays any UTF8 characters as question marks in the console output.
Additionally, par2cmdline does not seem to really support long-path notation on Windows, which is needed to open files whose path exceeds the MAX_PATH limit of 260 characters.
For example, `\\\\?\\C:\\test_win_unicode\\test_win_unicode.par2` also results in an error. We use that notation by default so we can avoid any problems with longer paths and not have to check path lengths every time.
animetosho/issues/13
animetosho/issues/17
sabnzbd/sabnzbd/pull/2674