View Issue Details

IDProjectCategoryView StatusLast Update
0006164Talerexchangepublic2021-08-24 16:23
Reporterfefe Assigned ToChristian Grothoff  
PrioritynormalSeverityminorReproducibilityN/A
Status closedResolutionfixed 
Product Version0.7.0 
Target Version0.7.1Fixed in Version0.7.1 
Summary0006164: deposit_cb return value (and code) are confusing
DescriptionThe comment above the callback says:

1569 * Function called with details about deposits that have been made,
1570 * with the goal of auditing the deposit's execution.

The code does this:

1617 if (GNUNET_DB_STATUS_SUCCESS_NO_RESULTS == qs)
1618 {
1619 report_row_inconsistency ("deposits",
1620 rowid,
1621 "denomination key not found");
1622 return GNUNET_OK;
1623 }

I wonder if this is the right return code. After all we are auditing existing, past transactions.
If they key is not in the database, then the transaction should not have made it this far...?

Next:
1646 struct TALER_DepositRequestPS dr = {
1647 .purpose.purpose = htonl (TALER_SIGNATURE_WALLET_COIN_DEPOSIT),
1648 .purpose.size = htonl (sizeof (dr)),
1649 .h_contract_terms = *h_contract_terms,
1650 .timestamp = GNUNET_TIME_absolute_hton (timestamp),
1651 .refund_deadline = GNUNET_TIME_absolute_hton (refund_deadline),
1652 .deposit_fee = issue->fee_deposit,
1653 .merchant = *merchant_pub,
1654 .coin_pub = *coin_pub
1655 };
1656
1657 if (GNUNET_OK !=
1658 TALER_JSON_merchant_wire_signature_hash (receiver_wire_account,
1659 &dr.h_wire))

This writes a hash code into dr.h_wire. However it is never explicitly referenced again...?

Turns out it is referenced here:

1680 if (GNUNET_OK !=
1681 GNUNET_CRYPTO_eddsa_verify (TALER_SIGNATURE_WALLET_COIN_DEPOSIT,
1682 &dr.purpose,
1683 &coin_sig->eddsa_signature,
1684 &coin_pub->eddsa_pub))

This is very confusing because dr.purpose is the first element in dr. We are really passing in dr, not dr.purpose. This function will take dr.purpose.size bytes from &dr.purpose (== &dr), and we initialized that here:

1648 .purpose.size = htonl (sizeof (dr)),

To get around these type mismatches, the verify functions liberally casts everything to void:

 850 unsigned char *m = (void *) validate;
 851 size_t mlen = ntohl (validate->size);
 852 unsigned char *s = (void *) sig;

I recommend rewriting the code to make clearer what it actually does.

At the very least I recommend adding a few explanatory comments.

Also note that the code relies on GNU compiler extensions to mark the struct as packed, in the hopes that the on-the-wire format will then stay the same across all platforms. That is a risky way to do this as you are tying yourself to a specific platform ABI. It may be better to not use the packed attribute and write custom serialization code.
TagsNo tags attached.

Activities

Christian Grothoff

2020-04-07 17:43

manager   ~0015558

Ok, let's start with line 1622. Yes, it is very, very strange/bad that the denomination key was not found. However, note that the foreign key constraints do not apply here: we have two databases, exchange and auditor. The denomination was not found in the auditor-DB. So basically, the exchange created another denomination and didn't tell the auditor about it, or the auditor somehow managed to forget it. So AFAIK this _can_ actually happen. Now, if we return #GNUNET_SYSERR, the overall transaction is aborted. If it was a SOFT error (serializability), it would be restarted. But it isn't, and clearly restarting wouldn't help. It's also not a HARD error where we cannot possibly continue to audit anything else and need help (like an amount overflow or internal invariant failure). Returning GNUNET_OK means to continue the audit, and we logged the (serious) error. So this looks fine to me.

Christian Grothoff

2020-04-07 17:44

manager   ~0015559

Line 1659: dr is used in line 1682, specifically the *entire* 'dr' struct, including dr.h_wire. Without this, the signature validation would fail.

Christian Grothoff

2020-04-07 17:51

manager   ~0015560

Oh, and the signing API is intentionally defined exactly this way: one MUST use packed structs that are identical on all platforms. And we have thousands of these structures in the code, custom serialization would require extra copying and TONS of extra logic, and so yes, we do intentionally rely on C compilers packing them as specified. We also do try hard to make them well-aligned in the first place, so the packing directives are technically sugar-coating. C is not exactly arbitrary in terms of packing rules, so theoretically correct compilers should still produce the same results. And those directives are well-supported by llvm as well. And yes, the signing API expects a purpose, but the purpose includes a 'size' field and that MUST match the actual size of the data structure. Note that the suffix "P" is used in Taler to indicate Packed structures, and PS for Packet-for-Signing.

I've fixed the (void*)-casting that Florian introduced and am now properly casting to "const void *" ;-).

Christian Grothoff

2020-04-07 17:57

manager   ~0015561

I've substantially increased the comment on the sign and verify functions to hopefully make their behavior more clear: 90d63f824..29bd17729

Christian Grothoff

2020-04-07 23:50

manager   ~0015565

Fefe: can we consider this 'resolved'?

fefe

2020-04-08 10:38

developer   ~0015572

You can consider my objections overruled if that helps :-)

I'm still not happy with the APIs. If you are going to verify a blob, pass in a pointer to the blob and a length.
And make it look like a pointer to the blob, not to a part of the struct that happens to be the first of the struct.

In the malloc case you argued against checking return values to keep the code readable, so it can be verified easily.
Well, this code is confusing and cannot be verified easily. I suggest applying the same line of argument here.

Note that I never suggested you did this by mistake. I always assumed it was by design. I just happen to disagree with your design.

fefe

2020-04-08 12:11

developer   ~0015574

Let me add one more concern:

1669 GNUNET_assert (GNUNET_OK ==
1670 TALER_amount_add (&total_bad_sig_loss,
1671 &total_bad_sig_loss,
1672 amount_with_fee));

Surely you don't want to abort the process if incoming data is wrong?
You want to log an error first, I would assume?

Christian Grothoff

2020-04-08 12:41

manager   ~0015575

I'm not trying to overrule your objections, I've tried to (a) address them by improving the documentation, and (b) explain what the reasoning is and why it works. Putting this on 'feedback' doesn't imply we are DONE.

Passing the length twice (in the struct and outside), maybe doable, but maybe not explicitly. These days I also actually usually dislike passing the purpose twice (to verify-routine), that's a historical artifact IMO.

So let's look at this constructively: What about wrapping the sign/verify functions with a macro that does a 'sizeof()' on the PS-struct, and checks that the sizeof() matches the purpose.size field PLUS that a == &a->purpose? That way, we could have basically exactly the same client-side code (except the .purpose would have to be removed), would gain an extra check on the size being 'correct' and also still enforce the struct _starts_ with the purpose. Would that be a style you would approve of?

Basically:

#define sign(ps,priv,sig) do {
   GNUNET_assert (sizeof(*ps) == ntohl (ps->purpose.size));
   GNUNET_assert ((void*) ps == (void*) &ps->purpose);
  legacy_sign(ps, priv, sig);
} while (0);

Verify analogous.

Christian Grothoff

2020-04-08 12:45

manager   ~0015576

On the "amount_add": Well, this is one of my bigger uncertainties. Right now, this code does result in a (not nice) auditor process crash. Game over. On "untrusted" input, so inherently 'bad'. It does "log" (GNUNET_assert() generates a log message to stderr), but not to the auditor-report (as one may expect). So I 100% can see why you go "wtf" on this one. It's a point where I _expected_ a discussion. Now, as per your e-mail, I feel this is not a discussion to have here or even via e-mail, as you clearly react allergic to even putting a bug on 'feedback', so let's table this one for the next Mumble.

fefe

2020-04-08 13:42

developer   ~0015577

Why verify that the value in purpose.size is correct if you know it? You could just put the right value there in the macro.
The less work the caller needs to get right, the better.

Also the second assertion can (and should) be a compile-time assertion.
See https://en.cppreference.com/w/c/language/_Static_assert for details.

Christian Grothoff

2020-04-08 13:45

manager   ~0015578

Ok, mini-Telco summary:
1) We'll change the auditor's amount assertions to 'special' logging (no JSON output, non-zero exit) with transaction abort logic (retry to re-audit those transactions), and document the possible failure and what to do in that case (fix corrupt database, then retry) in the auditor manual.

2) proposed macro for sign/verify considered a major improvement, to be implemented.

3) Fefe will try to file more, eh, atomic reports ;-).

Christian Grothoff

2020-04-08 13:48

manager   ~0015579

I don't want to put the value into the ps-struct because that way I can accept 'const' arguments. Also, it might be confusing for those who do not inspect the macro as to where that field is initialized. Side effects are also bad style. Copying the entire struct on the stack just so I can accept a 'const' value and have no side-effect and still initialize the length field seems worse.

Good point on the _static_assert. We'll need to define a macro for that, as we may not be universally on C11.

Christian Grothoff

2020-04-08 18:24

manager   ~0015588

Sign API fixes in:
gnunet.git: 17113fc2f..ba4da8c3f
exchange.git: b22ec757..50bc862a
merchant.git: e781a77..8eb0270
sync.git: b2efa7c..3e60624
anastasis.git: 529a3db..009658d

Christian Grothoff

2020-04-08 18:24

manager   ~0015589

Leaving bug open because of sub-point https://bugs.gnunet.org/view.php?id=6164#c15574
("bogus" asserts in audit logic).

Christian Grothoff

2020-04-08 23:57

manager   ~0015596

Bogus asserts are fixed in 1554cc31..84a40be0, just auditor documentation update is missing.

Christian Grothoff

2020-04-09 11:51

manager   ~0015603

docs.git, ff0d9a3..7a84fac, extends the auditor manual with documentation relating to the amount-arithmetic failure mode.

Issue History

Date Modified Username Field Change
2020-04-07 17:13 fefe New Issue
2020-04-07 17:13 fefe Status new => assigned
2020-04-07 17:13 fefe Assigned To => Christian Grothoff
2020-04-07 17:43 Christian Grothoff Note Added: 0015558
2020-04-07 17:44 Christian Grothoff Note Added: 0015559
2020-04-07 17:51 Christian Grothoff Note Added: 0015560
2020-04-07 17:51 Christian Grothoff Assigned To Christian Grothoff =>
2020-04-07 17:51 Christian Grothoff Status assigned => feedback
2020-04-07 17:57 Christian Grothoff Note Added: 0015561
2020-04-07 23:50 Christian Grothoff Note Added: 0015565
2020-04-08 10:38 fefe Note Added: 0015572
2020-04-08 10:38 fefe Status feedback => new
2020-04-08 12:11 fefe Note Added: 0015574
2020-04-08 12:41 Christian Grothoff Note Added: 0015575
2020-04-08 12:45 Christian Grothoff Note Added: 0015576
2020-04-08 13:42 fefe Note Added: 0015577
2020-04-08 13:45 Christian Grothoff Note Added: 0015578
2020-04-08 13:48 Christian Grothoff Note Added: 0015579
2020-04-08 14:01 Christian Grothoff Assigned To => Christian Grothoff
2020-04-08 14:01 Christian Grothoff Status new => assigned
2020-04-08 14:01 Christian Grothoff Target Version => 0.7.1
2020-04-08 18:24 Christian Grothoff Note Added: 0015588
2020-04-08 18:24 Christian Grothoff Note Added: 0015589
2020-04-08 23:57 Christian Grothoff Note Added: 0015596
2020-04-09 11:51 Christian Grothoff Note Added: 0015603
2020-04-09 11:51 Christian Grothoff Status assigned => resolved
2020-04-09 11:51 Christian Grothoff Resolution open => fixed
2020-04-09 11:51 Christian Grothoff Fixed in Version => 0.7.1
2021-08-24 16:23 Christian Grothoff Status resolved => closed