This is the mail archive of the ecos-maintainers@sourceware.org mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: eCos 3.0 beta 1 punch list #2


Bart Veer wrote:
"Jifl" == Jonathan Larmour <jifl@eCosCentric.com> writes:


    >> So instead of making the change now, for all anoncvs targets
    >> and with no possiblity of testing on most of those targets, I
    >> want to do the work in eCosPro first. It can then be merged
    >> into a future anoncvs release, once I am confident that it is
    >> not going to break anything.

    Jifl> Why can it not be in both? Why should eCosPro be special
    Jifl> with respect to a changing API? And incompatible.

Switching the flash subsystem to initialize via static constructors
means that the h/w init order changes for every platform that uses
RedBoot, and very possibly some of the remaining platforms as well.
Currently flash initialization happens after cyg_start() has started
running.  All static constructors have already run. Virtual vectors
have been sorted out. The memory map has been sorted via
HAL_MEM_REAL_REGION_TOP() if necessary. Any bist() self-test code has
run. Some other RedBoot init code may have run as well.

Not always. For example if you build JFFS2 into RedBoot, as people now not infrequently do, cyg_flash_init gets called at priority CYG_INIT_IO. We have not had any problems with this reported by ecospro or anoncvs users. Certainly while it's possible to construct straw men that could cause problems, I think we need something rather more concrete as an argument against fixing the API.


Incidentally, you said the messiness was to do with initialising multiple flash devices, which wouldn't apply for most ports anyway as they will use the legacy device API which only supports a single device.

Switching to a static constructor means that the flash initialization
will now happen much earlier, before cyg_start(). Now, it is likely
that on most platforms that won't cause any problems - it worked fine
on the platform I tried. But we may easily have overlooked some
dependency somewhere which means that other platforms will stop
working, so that if anybody decides to rebuild and install RedBoot
after 3.0 they'll end up with a brick.

There are all sorts of factors with many pieces of code that have changed substantially: redboot (particularly its flash code) which is far from identical with eCosPro, device drivers, hal/common, HALs, v2 flash drivers, legacy mode in v2 io/flash (not tested by ecospro certainly), compiler differences, documentation which may have bit-rotted. This seems relatively minor compared with some of those. Your concern is the rationale for the (now de facto obsolete) ecos/images hierarchy - known good images. We can't promise anything otherwise - it's just best effort. I see nothing especially different here than changes in any of those other packages which could also easily cause targets to be bricked.


Rolling out this change in anoncvs means that it will affect every
single target in one fell swoop. Rolling it out in eCosPro means that
we get to test the new code one or a small number of ports at a time,
as we do new ports. Typically we'll make very sure that we have a way
of unbricking any board we are porting to, before we start messing
about with the flash. So if there are problems with the new code, we
can find them and fix them prior to any eCosPro release, with no risk
of affecting either our customers or anoncvs users. Over a period of
time the new code will be run on a variety of hardware with different
flash setups.

Meanwhile eCos 3.0 users are left with a deprecated obsolete API. How eCosCentric chooses to roll this out for their customers is not my concern at the moment. It's important to get the API right for a point zero release.


What I am proposing is pretty much the same strategy that was adopted
for the flash V2 work. That did not immediately go into the anoncvs
trunk because it was considered too dangerous a change.  Instead it
went into a separate branch, for those anoncvs users wanting the new
functionality, and it went into eCosPro.

That seems rather revisionist. It went into eCosPro because eCosCentric wanted the functionality, not because of some altruistic notion of testing things in advance for public eCos users. It was developed in a branch because it was developed progressively in it, rather than being a big-bang replacement per se.


Now, for most application developers this is a storm in a teacup. It
really does not matter all that much whether the flash
auto-initializes or whether they need to make an extra init call from
inside their application.

What matters is whether we care a jot about compatibility with documented APIs. If we are going to say that code written against published APIs may not work with eCos 3.0.1 or eCos 3.1 or whatever with the same semantics, then sure we should go ahead.


So, we have a change that offers very little benefit to users, which
significantly changes the order in which nearly all targets
initialize, which may cause some number of targets to break and leave
users with bricks, being put into the source tree very close to
release after not very much testing. The more I worked on this, the
more I realized this was a really bad idea.

That's what the time between beta and final is for. And after 3.0 we can look for a minor bug fix 3.0.1 release. Even if your paranoia turned out to be justified, we could just start shipping a 3.0.1.


The discussion on all this back in November involved an incompatible
API change, changing cyg_flash_init() so that it no longer took any
arguments. That meant we had to add new functions for setting the
printf function, e.g. cyg_flash_set_global_printf() and
cyg_flash_set_printf().

Sure, those names are fine too. I was in a rush this morning. Maybe the naming was even my idea before - haven't looked.


Now, there are no longer any plans to change the prototype of
cyg_flash_init().  It will continue to take a printf function, and act
as cyg_flash_set_global_printf(). However we'll probably want to mark
it as deprecated to make sure that people are aware things are done a
bit differently than before.

But without a cyg_flash_set_global_printf() people can't write code that will continue working in the same way from one release to the next (i.e. API breakage).


Since for 3.0 it will still be necessary for applications to call
cyg_flash_init(), and since cyg_flash_init() still takes a printf
pointer, there is very little point in implementing the set_printf()
functions right now.

You are conflating two problems - initialisation and the printf function. We can solve the printf problem separately from the initialisation one. For 3.0 we /could/ say that calling cyg_flash_init with a non-NULL argument is deprecated. The printf function should be set with the set_printf() function. With the initialisation fixed, cyg_flash_init could then be #defined to something like


#define cyg_flash_init(pf) do { if (pf) cyg_flash_set_global_printf(pf); } while (0)

leaving no overhead (with an optimising compiler) for those who called it with NULL. We could leave removing it till later. But I'd rather just remove it now and say that the printf function can only be set by set_(global)_printf and the argument to cyg_flash_init is ignored.

This would be sufficient to prevent API breakage.

But stepping further back, I'd rather just sort out the initialisation problem anyway and get it out of the way. I see the risks as small given the facts about initialisation dependencies. It's not like this is the final and won't be tested before release - we will be actively seeking testers and I bet we'll get them. I can give you an I Told You So card, which will oblige me to buy you a Wrestlers lunch if there is any bricking caused by the initialisation order.

It would only affect application code that
currently calls cyg_flash_init() multiple times with different
arguments, to change the printf function at run-time.
I suspect that very few applications do that.

Any program using JFFS2 already does this. The ability to change the printf function was specifically added by contributors.


BTW, you still haven't clarified what the messy bit is with initialising multiple devices.

Jifl
--
eCosCentric Limited      http://www.eCosCentric.com/     The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK.       Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["The best things in life aren't things."]------      Opinions==mine


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]