View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002422 | GNUnet | Win32 port (deprecated) | public | 2012-06-13 13:05 | 2019-12-25 12:27 |
Reporter | LRN | Assigned To | Christian Grothoff | ||
Priority | low | Severity | tweak | Reproducibility | N/A |
Status | closed | Resolution | won't fix | ||
Platform | W32 | OS | NT | OS Version | 6.1.7601 |
Product Version | Git master | ||||
Target Version | 0.12.1 | Fixed in Version | 0.12.0 | ||
Summary | 0002422: Correct UTF-8 conversion when interacting with environment variables. | ||||
Description | I've made a sample program that reads a string from three different files (CP1251-, UTF8- and UTF16-encoded), then uses SetEnvironmentVariableA and SetEnvironmentVariableW to put this string into a variable (with different names), then uses GetEnvironmentVariableA/W on each of those variables and writes the result into a file.
In the following list, "X->Y" means "the variable was set with SetEnvironmentVariableX and retrieved with GetEnvironmentVariableY". "OK" means that the output is byte-identical to the input.
For CP1251-encoded text:
A->A - OK
A->W - OK (converted to wide-string using CP1251->UTF16)
W->A - garbage (0x3f3f61)
W->W - OK (with an extra zero byte, which might be my fault, or may be due to the way string contents are interpreted by SetEnvironmentVariableW)
For UTF8-encoded text:
A->A - OK
A->W - misencoded (converted to wide-string using CP1251->UTF16; should look fine after UTF16->CP1251 conversion and re-interpreting as UTF8)
W->A - garbage (filled with 0x3F, which indicates a string converter failure)
W->W - OK
For UTF16-encoded text:
A->A - OK
A->W - misencoded (converted to wide-string using CP1251->UTF16; should look fine after UTF16->CP1251 conversion and re-interpreting as UTF16)
W->A - OK (CP1251-encoded)
W->W - OK
I think it is very likely that W32 stores environment variables in UTF16-encoded form internally (just as it does with filenames). As long as you do not lie about the string encoding when setting a variable (that is, the A functions get text in the CP* matching your locale and the W functions get UTF16), the output of GetEnvironmentVariable* will be OK. The exception is W->A for strings with characters that are not representable in the CP* your locale uses; this case was not covered by this test, since I had a hard time producing such a string. It is also possible to pass UTF8 through, as long as the reader of that variable knows to expect UTF8 encoding.
More to the point: on W32, variables such as PATH should be read using wide-character-aware functions and converted to UTF8 as necessary (since GNUnet internally uses UTF8 almost exclusively these days). Likewise, on W32 variables should be set using wide-character-aware functions (after performing UTF8->UTF16 conversion for their arguments). | ||||
Tags | No tags attached. | ||||
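The set/get matrix in the description can be approximated without a Windows machine. The following is a minimal sketch, not the reporter's test program: it assumes that the environment block holds UTF16 text (as the description suspects), that the ANSI code page is CP1251, and that Python's codecs are a close enough stand-in for the conversions Windows performs. The names `set_a`, `set_w`, `get_a`, `get_w`, `utf8_to_wide` and `wide_to_utf8` are hypothetical helpers, not Win32 or GNUnet APIs.

```python
ANSI = "cp1251"     # assumption: ANSI code page of the reporter's locale
WIDE = "utf-16-le"  # Windows wide strings are UTF-16, little-endian


def set_a(raw: bytes) -> str:
    """Model SetEnvironmentVariableA: the bytes are taken to be ANSI text."""
    return raw.decode(ANSI, errors="replace")


def set_w(raw: bytes) -> str:
    """Model SetEnvironmentVariableW: the bytes are taken to be UTF-16."""
    return raw.decode(WIDE, errors="replace")


def get_a(stored: str) -> bytes:
    """Model GetEnvironmentVariableA: unrepresentable characters become '?'."""
    return stored.encode(ANSI, errors="replace")


def get_w(stored: str) -> bytes:
    """Model GetEnvironmentVariableW: return the stored text as UTF-16."""
    return stored.encode(WIDE)


def utf8_to_wide(raw: bytes) -> bytes:
    """Conversion the description recommends before calling the W setter."""
    return raw.decode("utf-8").encode(WIDE)


def wide_to_utf8(raw: bytes) -> bytes:
    """Conversion the description recommends after calling the W getter."""
    return raw.decode(WIDE).encode("utf-8")


if __name__ == "__main__":
    text = "Привет"  # sample string; the reporter's actual input is unknown
    utf8 = text.encode("utf-8")

    # A->A with UTF8 input: the bytes survive the round trip unchanged.
    print("A->A:", get_a(set_a(utf8)) == utf8)

    # A->W with UTF8 input: misencoded, but recoverable via UTF16->CP1251.
    wide = get_w(set_a(utf8))
    print("A->W recoverable:", wide.decode(WIDE).encode(ANSI) == utf8)

    # W->A with UTF8 input passed off as UTF16: only 0x3F bytes come back.
    print("W->A:", get_a(set_w(utf8)))

    # Recommended path: convert UTF8->UTF16, use W on both sides, convert back.
    print("W->W:", wide_to_utf8(get_w(set_w(utf8_to_wide(utf8)))) == utf8)
```

In this model only the last path keeps UTF8 intact without the reader having to know which code page the writer used, which is the argument the description makes for the wide-character functions. The sample string is chosen so that its UTF8 bytes all map to defined CP1251 characters; real Windows conversions may treat undefined bytes differently from Python's `replace` handler.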
Date Modified | Username | Field | Change |
---|---|---|---|
2012-06-13 13:05 | LRN | New Issue | |
2012-06-13 13:05 | LRN | Status | new => assigned |
2012-06-13 13:05 | LRN | Assigned To | => LRN |
2012-09-29 21:27 | Christian Grothoff | Severity | minor => tweak |
2018-06-07 01:19 | Christian Grothoff | Assigned To | LRN => |
2018-06-07 01:19 | Christian Grothoff | Status | assigned => confirmed |
2019-12-20 13:19 | schanzen | Status | confirmed => resolved |
2019-12-20 13:19 | schanzen | Resolution | open => won't fix |
2019-12-20 13:19 | schanzen | Target Version | => 0.12.1 |
2019-12-20 13:19 | schanzen | Note Added: 0015199 | |
2019-12-25 12:27 | Christian Grothoff | Assigned To | => Christian Grothoff |
2019-12-25 12:27 | Christian Grothoff | Status | resolved => closed |
2019-12-25 12:27 | Christian Grothoff | Fixed in Version | => 0.12.0 |
2024-01-12 14:28 | schanzen | Category | Win32 port => Win32 port (deprecated) |