This is the mail archive of the cygwin@sourceware.cygnus.com mailing list for the Cygwin project.



RE: problem in C++ pointer


> -----Original Message-----
> From: Jay Krell [mailto:jay.krell@cornell.edu]
> Sent: Sunday, March 12, 2000 9:09 AM
> To: swe sd
> Cc: cygwin@sourceware.cygnus.com
> Subject: Re: problem in C++ pointer
> 
> 
> The behavior at a different time and place of code with undefined behavior
> is not entirely relevant. Different processors, different command lines,
> different optimizations (possibly processor specific), different versions,
> can lead to different definitions of undefined behavior. Still, sometimes
> undefined behavior is predictable.
> 
> I do not yet know if this code is undefined, but certainly the outcome of
> code like
> i = i++;
> i = ++i;
> i = i--;
> i = --i;
> printf("%d%d", i, ++i);
> printf("%d%d", --i, ++i);
> etc.
> 
> is. 

NO it isn't!... In 'printf("%d%d", i, ++i);' and 'printf("%d%d", --i, ++i);'
you do NOT know in which order the arguments of printf are evaluated; note
that this order can be, and often is, different between the two calls to
printf.

Thus if i == 5, 'printf("%d%d", i, ++i);' can display either "56" or "66",
after which i == 6; then 'printf("%d%d", --i, ++i);' can display either "56"
or "67", and i is still 6 afterwards.

Note that the same applies if you write "j = (--i)*(++i);": you have no
clear idea of what you'll get :-)...
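
One way to take the guesswork out of the printf case is to compute each value
in its own statement before formatting. This is only a minimal sketch: the
helper name show_then_increment is made up for illustration, and it uses
snprintf into a buffer (instead of printf) so that the result can be checked.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original post): formats the old and
// the new value of i with a fixed, explicit order of operations.
std::string show_then_increment(int& i)
{
    int before = i;   // read the old value first
    ++i;              // modify i in a statement of its own
    int after = i;    // read the new value

    char buf[32];
    std::snprintf(buf, sizeof buf, "%d%d", before, after);
    return std::string(buf);
}
```

Called with i == 5, this yields "56" and leaves i == 6, whatever order the
compiler picks for evaluating function arguments.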

In C and C++ there are only FIVE points where sequencing of operations is
guaranteed (and only as far as the final result is concerned, not memory
accesses in some cases):
1)  Between statements (that is, at each occurrence of a semicolon or
closing brace).
2)  Between the evaluation of the left-hand side (lvalue) of an "=", "." or
"->" operator and the actual access to the referred-to lvalue (assignment to
it, or access to a member of it, including a call to a member function).
3)  Between the evaluation of all the arguments of a function and its call,
and between the actual use of the returned value (storing it or using it in
an expression) and any other code (which means that calling the function and
using the result is implicitly atomic).
4)  At the short-circuit operators (|| and &&), where the left expression is
evaluated before the right one.
5)  At each comma operator; but be VERY careful: the comma operator is NOT
the same as the comma used to separate the arguments of a function call. The
comma operator separates two expressions that are evaluated in order
(first the left one, then the right one); it then DISCARDS
the result of the left expression and yields the value of the right one.

Apart from these cases (and within the sub-expressions of a short-circuit or
comma operator) the order of evaluation of the components of an expression
is NOT defined and may change for very obscure reasons known only to the
compiler (such as the number of registers needed to evaluate an expression or
store its result, or the need to keep some intermediate unchanging value
for further use in a subsequent statement).
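
The short-circuit and comma-operator cases above can be sketched as follows.
The helper note() and the two function names are hypothetical, introduced
only to make the evaluation order visible in a string.

```cpp
#include <string>

// Hypothetical helper: appends a tag to a log and returns a value,
// so the order in which operands are evaluated shows up in the log.
static int note(std::string& log, char tag, int value)
{
    log += tag;
    return value;
}

// && evaluates its left operand first and skips the right one
// when the left one is false.
std::string short_circuit_order()
{
    std::string log;
    bool r = note(log, 'a', 0) && note(log, 'b', 1);
    (void)r;             // the result itself is not of interest here
    return log;          // only 'a' was evaluated
}

// The comma OPERATOR evaluates left then right, discards the left
// value and yields the right one. The commas inside each note(...)
// call are argument separators and give no such ordering guarantee.
int comma_operator_value(std::string& log)
{
    int v = (note(log, 'x', 1), note(log, 'y', 2));
    return v;
}
```

Here short_circuit_order() returns "a", and comma_operator_value() returns 2
after appending "xy" to the log.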

> If it isn't clear if code falls into this pattern, that's probably
> enough reason to not write code like it. But even so, if it is defined, it
> is a bug if Gcc doesn't consistently give it the standard definition.

The code pattern above IS undefined, thus can give you ANY result, even the
right one :-)

> 
> Here's a good imho boiled down example:
> class X
> {
> public:
>     X* F(int i) { printf("%d", i); return this;}
> };
> 
> int i = 0;
> X x, *px = &x;
> px->F(i++)->F(i++)->F(i++);
> 
> is the expected output 012 or is it undefined?

This is undefined.

> Could it reasonably be 000, if none of the storages occur till after all
> parameters are evaluated but before any functions are called?

It could NOT be 000, as (from the C++ point of view) i++ directly changes the
"i" variable, be it in memory or elsewhere; so it could be 012, 021, 120, 102,
210, or 201 but nothing else (I think that's enough :-))
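
If 012 is what you want, a rewrite with one call and one increment per
statement gets it without relying on the compiler. A minimal sketch, assuming
it is acceptable for F to record into a string member instead of calling
printf (that member, and the name chained_in_order, are my additions for
illustration):

```cpp
#include <string>

// Same shape as the class X in the quoted example, except that F
// records into a string (an assumption, so the result can be inspected).
class X
{
public:
    std::string out;
    X* F(int i) { out += static_cast<char>('0' + i); return this; }
};

// Hypothetical rewrite: each call and each increment is a separate
// statement, so the order is fixed by the statement boundaries.
std::string chained_in_order()
{
    int i = 0;
    X x, *px = &x;
    px = px->F(i); ++i;
    px = px->F(i); ++i;
    px = px->F(i); ++i;
    return x.out;
}
```

With this version the recorded output is "012".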

> 
> according to the parts of the standard I can find and decipher:
> [expr] 5.0.2: "..uses of overloaded operators are transformed into function
> calls as described in 13.5. Overloaded operators obey rules for syntax
> specified in this clause, but the requirements of operand type, lvalue, and
> evaluation order are replaced by the rules for function call".
> 
> I couldn't find the evaluation order for function calls except
> 
> [intro.execution]1.8.17 "..There is also a sequence point after the copying
> of a returned value and before the execution of any expressions outside the
> function". I don't know if that "copying of a returned value" applies to
> pointers and references, and I find "expressions outside the function"
> unclear.

This is needed in C++ because copying the result may implicitly call a
user-defined conversion operator or copy constructor; without this rule,
avoiding incorrect evaluation-order dependencies could be very difficult.

> 
> Ordinarily, the sides of a -> I don't believe have a defined order of
> evaluation:

They have one, although quite a trivial one: the left side of "->" has to be
evaluated BEFORE the call can be made, but the ARGUMENTS of the call can
be evaluated BEFORE, AFTER, or INTERMIXED with the evaluation of the left
part of "->". 

> 
> eg:
> class C
> {
> public:
> int i;
> C(int i) { this->i = i ; }
> void F(C* c) { printf("%d%d", this->i, c->i); }
> };
> 
> C carray[2] = { C(0), C(1) };
> C* pc = carray;
> 
> (pc++)->F(pc++);
> 
> Is the output defined?

No, the order of the two "pc++" is undefined.
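
To pin the order down, pick the object and the argument in separate
statements. A minimal sketch, assuming "01" is the intended result; here F
returns its text instead of printing it, and the name call_in_fixed_order is
hypothetical:

```cpp
#include <cstdio>
#include <string>

class C
{
public:
    int i;
    C(int i) { this->i = i; }
    // Returns the text instead of printing it (an assumption, for checking).
    std::string F(C* c)
    {
        char buf[32];
        std::snprintf(buf, sizeof buf, "%d%d", this->i, c->i);
        return std::string(buf);
    }
};

// Hypothetical rewrite of (pc++)->F(pc++): the two increments now sit
// in separate statements, so their order is no longer up to the compiler.
std::string call_in_fixed_order()
{
    C carray[2] = { C(0), C(1) };
    C* pc = carray;
    C* self = pc++;   // first element becomes the object
    C* arg  = pc++;   // second element becomes the argument
    return self->F(arg);
}
```

This version produces "01"; swapping the two statements that set self and arg
would give "10" instead.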

> It could "reasonably" be
> 01
> 10

Yes

> 00

I don't think so; this would mean parallelizing the two "pc++", thus breaking
their implicitly atomic character.
 
> ?
> 
> I'm assuming the order of eval of an unoverloaded -> is the same, if
> defined, as the order of eval of ".".

Yes.

> 
> "sequence point" is not in the index..
> 
>  - Jay
> 

Note that all this is quite intricate and was explicitly left undefined to
allow the optimizer to do its work. There is only one rule to follow: NEVER
modify the same variable twice in the same statement.

If you NEED such a construct, introduce intermediate variables and remove
the offending order dependency. Note that IIRC gcc with "-Wall" will warn
you of these incorrect order dependencies (among lots of other dubious
constructs ;->).
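
As an illustration of the intermediate-variable approach, here is a sketch of
"j = (--i)*(++i);" rewritten with the order chosen explicitly (decrement
first, then increment). The function name is made up, and the choice of order
is an assumption about what the author of such code would have meant. As far
as I know, the GCC warning concerned is -Wsequence-point, which -Wall enables.

```cpp
// Hypothetical rewrite of j = (--i)*(++i): every modification of i
// is a statement of its own, so the result no longer depends on the
// order in which the compiler evaluates the two operands of "*".
int product_dec_then_inc(int& i)
{
    --i;
    int left = i;     // value after the decrement
    ++i;
    int right = i;    // value after the increment
    return left * right;
}
```

Starting from i == 6 this returns 5 * 6 == 30 and leaves i == 6.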

Hope this helps,

		Bernard

--------------------------------------------
Bernard Dautrevaux
Microprocess Ingéniérie
97 bis, rue de Colombes
92400 COURBEVOIE
FRANCE
Tel:	+33 (0) 1 47 68 80 80
Fax:	+33 (0) 1 47 88 97 85
e-mail:	dautrevaux@microprocess.com
		b.dautrevaux@usa.net
-------------------------------------------- 
