View Issue Details

ID: 0002422
Project: GNUnet
Category: Win32 port (deprecated)
View Status: public
Last Update: 2019-12-25 12:27
Reporter: LRN
Assigned To: Christian Grothoff
Priority: low
Severity: tweak
Reproducibility: N/A
Status: closed
Resolution: won't fix
Platform: W32
OS: NT
OS Version: 6.1.7601
Product Version: Git master
Target Version: 0.12.1
Fixed in Version: 0.12.0
Summary: 0002422: Correct UTF-8 conversion when interacting with environment variables.
Description: I've made a sample program that reads a string from three different files (CP1251-, UTF8- and UTF16-encoded), then uses SetEnvironmentVariableA and SetEnvironmentVariableW to put this string into a variable (with different names), then uses GetEnvironmentVariableA/W on each of those variables and writes the result into a file.

In the following list "X->Y" means "Variable was set with SetEnvironmentVariableX and retrieved with GetEnvironmentVariableY". "OK" means that output is byte-identical to the input.

For CP1251-encoded text:
A->A - OK
A->W - OK (converted to wide-string using CP1251->UTF16)
W->A - garbage (0x3f3f61)
W->W - OK (with extra zero byte, which might be my fault, or may be due to the way string contents are interpreted by SetEnvironmentVariableW)

For UTF8-encoded text:
A->A - OK
A->W - misencoded (converted to wide-string using CP1251->UTF16; should look fine after UTF16->CP1251 conversion and re-interpreting as UTF8)
W->A - garbage (filled with 0x3F, which indicates a string converter failure)
W->W - OK

For UTF16-encoded text:
A->A - OK
A->W - misencoded (converted to wide-string using CP1251->UTF16; should look fine after UTF16->CP1251 conversion and re-interpreting as UTF16)
W->A - OK (CP1251-encoded)
W->W - OK


I think it is very likely that W32 stores environment variables in UTF16-encoded form internally (just as it does with filenames). As long as you do not lie about the string encoding when setting a variable (that is, the string really is in the CP* code page matching your locale, or in UTF16), the output of GetEnvironmentVariable* will be OK. The exception is W->A for strings containing characters not representable in the CP* code page that your locale uses; this case was not covered by this test, since I had a hard time producing such a string. It is also possible to pass UTF8 through, as long as the reader of that variable knows to expect UTF8 encoding.
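The hypothesis above can be illustrated with a small, portable model. This is only a sketch: it does not call the real W32 API, the helper names (`set_env_a`, `set_env_w`, `get_env_a`, `get_env_w`) are hypothetical, and it assumes that the ANSI code page is CP1251 and that unrepresentable characters are replaced with `?` (0x3F) on the way back to the A functions.

```python
# Toy model only: a Python str stands in for the UTF-16 text that the
# report suspects W32 keeps internally. Nothing here calls the real API.
ACP = "cp1251"  # assumption: the ANSI code page of the reporter's locale

_env = {}  # variable name -> text as stored "internally"


def set_env_a(name, raw):
    """Model of SetEnvironmentVariableA: input bytes are read as ACP text."""
    _env[name] = raw.decode(ACP, errors="replace")


def set_env_w(name, raw):
    """Model of SetEnvironmentVariableW: input bytes are read as UTF-16LE."""
    _env[name] = raw.decode("utf-16-le", errors="replace")


def get_env_a(name):
    """Model of GetEnvironmentVariableA: unrepresentable characters become '?'."""
    return _env[name].encode(ACP, errors="replace")


def get_env_w(name):
    """Model of GetEnvironmentVariableW: stored text comes back as UTF-16LE."""
    return _env[name].encode("utf-16-le")


# A short Cyrillic sample word, written with escapes to stay ASCII-safe.
sample = "\u041f\u0440\u0438\u0432\u0435\u0442"

# UTF8 bytes through A->A: byte-identical, although the stored text is mojibake.
set_env_a("U8_A", sample.encode("utf-8"))
print(get_env_a("U8_A") == sample.encode("utf-8"))

# UTF8 bytes handed to the W setter, then read with the A getter.
set_env_w("U8_W", sample.encode("utf-8"))
print(get_env_a("U8_W"))

# Genuine UTF16 through W->A: comes back CP1251-encoded.
set_env_w("U16_W", sample.encode("utf-16-le"))
print(get_env_a("U16_W") == sample.encode(ACP))
```

Under these assumptions the model reproduces the pattern in the lists above: A->A round-trips any byte string that CP1251 can map both ways, A->W of UTF8 input yields text that is recoverable via UTF16->CP1251, and W->A of input that was not really UTF16 degrades to 0x3F bytes.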

More to the point: on W32, variables such as PATH should be read using wide-character-aware functions and converted to UTF8 as necessary (since GNUnet internally uses UTF8 almost exclusively these days). Likewise, on W32 variables should be set using wide-character-aware functions (after performing UTF8->UTF16 conversion on their arguments).
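The conversion step on each side of the wide-character calls might look like the following sketch. It only models the UTF8<->UTF16 boundary on byte buffers; the helper names `utf8_to_wide` and `wide_to_utf8` are hypothetical, they are not GNUnet functions, and the actual calls to SetEnvironmentVariableW/GetEnvironmentVariableW are left out.

```python
def utf8_to_wide(value):
    """UTF-8 bytes -> NUL-terminated UTF-16LE buffer, as a W function expects.

    Raises UnicodeDecodeError if the input is not valid UTF-8, so a caller
    cannot silently "lie" about the encoding.
    """
    text = value.decode("utf-8")
    return text.encode("utf-16-le") + b"\x00\x00"


def wide_to_utf8(buf):
    """UTF-16LE buffer (optionally NUL-terminated) -> UTF-8 bytes."""
    text = buf.decode("utf-16-le")
    # Keep only the part before the first NUL terminator, if there is one.
    return text.split("\x00", 1)[0].encode("utf-8")


# Round trip for a PATH-like value that mixes ASCII, Cyrillic and CJK text.
path_utf8 = "C:\\bin;D:\\\u041f\u0440\u043e\u0433\u0440\u0430\u043c\u043c\u044b;E:\\\u65e5\u672c\u8a9e".encode("utf-8")
wide = utf8_to_wide(path_utf8)
print(wide_to_utf8(wide) == path_utf8)
```

Because both directions go through UTF16 rather than a locale code page, characters that CP1251 cannot represent survive the round trip in this model.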
Tags: No tags attached.

Activities

schanzen

2019-12-20 13:19

administrator   ~0015199

Win32 support dropped in 0.11.7

Issue History

Date Modified Username Field Change
2012-06-13 13:05 LRN New Issue
2012-06-13 13:05 LRN Status new => assigned
2012-06-13 13:05 LRN Assigned To => LRN
2012-09-29 21:27 Christian Grothoff Severity minor => tweak
2018-06-07 01:19 Christian Grothoff Assigned To LRN =>
2018-06-07 01:19 Christian Grothoff Status assigned => confirmed
2019-12-20 13:19 schanzen Status confirmed => resolved
2019-12-20 13:19 schanzen Resolution open => won't fix
2019-12-20 13:19 schanzen Target Version => 0.12.1
2019-12-20 13:19 schanzen Note Added: 0015199
2019-12-25 12:27 Christian Grothoff Assigned To => Christian Grothoff
2019-12-25 12:27 Christian Grothoff Status resolved => closed
2019-12-25 12:27 Christian Grothoff Fixed in Version => 0.12.0
2024-01-12 14:28 schanzen Category Win32 port => Win32 port (deprecated)