This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: workflow idiom to compare zip/tgz with folder subtree


On Sep 24, 2015, at 6:50 PM, Paul wrote:
> 
> I am shuttling incremental work back and
> forth between two locations using disc.

In that case, you want a distributed version control system (DVCS), not a centralized one.  That rules out Subversion.  (And CVS.)  Fossil and Git are DVCSes, so they'll work in a case like this.  Mercurial and Bazaar (a.k.a. bzr) are also DVCSes, and both are also in the Cygwin package repo.

I don't know how to use any of the other three available DVCSes for a task like yours, but it's certainly easy enough with Fossil.

The command flow looks like this, assuming the removable disk is called R:, and using "f" as a short alias for "fossil":

   f new /cygdrive/r/shared-project.fossil
   cd ~/shared-project
   f open /cygdrive/r/shared-project.fossil
   f add *
   f ci -m 'initial checkin'
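
The "f" shorthand is not something Fossil provides; it has to be defined in the shell first. A minimal sketch, assuming Bash or another POSIX shell and that fossil is on the PATH (the name "f" simply follows the usage above):

```shell
# Thin wrapper so "f" can stand in for "fossil" in the commands shown here.
# A function is used rather than an alias because it also works in scripts.
f() {
    fossil "$@"
}
```

Putting this in ~/.bashrc should make it available in every new Cygwin terminal.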

Now everything in ~/shared-project is copied into the Fossil repo on the R: volume.  When you get to the remote site:

   cd ~/shared-project
   f open /cygdrive/r/shared-project.fossil

Now you have a copy of all the files from the R: drive.  If you open a Fossil repo within an existing tree that previously wasn't under Fossil management, it will ask whether you want to overwrite the preexisting files or leave them alone.  If you leave them alone, a subsequent "fossil diff" will show how your preexisting files differ from the ones in the Fossil repo on R:.
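
For a non-interactive version of the "leave them alone" choice, something like the following may work. It is only a sketch: the helper name compare_tree_with_repo is made up for illustration, and it assumes a Fossil version whose "open" command accepts the --keep option.

```shell
# Hypothetical helper: open REPO inside an existing directory TREE without
# touching the files already there, then show how they differ from the repo.
compare_tree_with_repo() {
    repo=$1
    tree=$2
    if [ ! -f "$repo" ]; then
        echo "repository not found: $repo" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$tree" &&
        fossil open --keep "$repo" &&  # --keep: leave existing files in place
        fossil changes &&              # one line per file that differs
        fossil diff                    # the differences themselves
    )
}
```

Usage would be along the lines of: compare_tree_with_repo /cygdrive/r/shared-project.fossil ~/shared-project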

After you make changes to files at either site, say "fossil ci" and it will open a text editor for you to describe your changes.  (Or use the -m option, as above.)

Then back at the other site:

   cd ~/shared-project
   f up

Now all your remote changes are synchronized.

If all that looks complicated, realize that there are only a few day-to-day commands: f ci, f up, f diff.

> the majority of the
> differences will not be relevant as the hierarcy exists at both sites.

If you're saying that there are files that need to be semi-synchronized between the sites, so that only *some* changes to individual files need to be copied, then Fossil is probably going to fight you.

If you're saying instead that some files in a given tree are sync'd and some aren't, that's easy.  That's actually the normal way to use Fossil, since with software development projects, you typically only store original source files, and never store anything that can be re-generated from those sources.

(Some projects bend that rule a bit, storing both configure.ac and configure, for example.)
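
One way to keep regenerable files out of the way is Fossil's ignore-glob setting, which can be versioned by placing it in a file under .fossil-settings in the checkout. A sketch, assuming LaTeX by-products are what needs excluding; the patterns are examples only, not a recommendation for your tree:

```shell
# Run from the top of the checkout, e.g. ~/shared-project.
mkdir -p .fossil-settings

# One glob pattern per line; these example patterns cover common LaTeX by-products.
cat > .fossil-settings/ignore-glob <<'EOF'
*.aux
*.log
*.toc
EOF
```

Files matching these patterns should then stop showing up in "fossil extras".  Adding and committing the ignore-glob file itself shares the list with the other site.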

> Most of the files are not software, though parallels can be drawn:
> Long SQL scripts, Matlab scripts, images, data files, VBA, Matlab
> files, text files, LaTeX files, image files, and M$ Office files
> (Access, Excel, Word, Powerpoint, PST).

Most of those things are sensible to store in Fossil.

The main thing you want to avoid storing is large binary files whose content changes substantially and often.

Uncompressed image files (e.g. TIFF without compression) are fine, because probably only *parts* of the image change from one update to the next, so Fossil will store only the differences, then compress that difference, so that you effectively get TIFF-with-compression, and more efficiently than storing a series of separately-compressed TIFFs besides.

Compressed image files (e.g. PNG) can be okay, as long as they change rarely.  The problem with compressed images is that the compression algorithm can change every byte in the file just because a single pixel changed, so the whole image has to be stored in the Fossil repo again.
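
You can get a feel for this effect without Fossil or an image editor. The sketch below uses gzip as a stand-in for PNG's compression (both are DEFLATE-based) and a text file as a stand-in for pixel data; those substitutions are assumptions for the sake of a quick experiment. It changes one byte of the input and counts how many bytes differ before and after compression. The exact compressed count depends on the gzip version, so none is claimed here.

```shell
# Work in a scratch directory.
work=$(mktemp -d)

# Two files of the same length that differ in exactly one byte
# (the first line is "1" in one and "2" in the other).
seq 1 20000 > "$work/a.txt"
sed '1s/1/2/' "$work/a.txt" > "$work/b.txt"

# Compress both; -n omits the name and timestamp so only content matters.
gzip -n -c "$work/a.txt" > "$work/a.gz"
gzip -n -c "$work/b.txt" > "$work/b.gz"

# cmp -l prints one line per differing byte position.
raw_diff=$(cmp -l "$work/a.txt" "$work/b.txt" | wc -l | tr -d ' ')
gz_diff=$(cmp -l "$work/a.gz" "$work/b.gz" 2>/dev/null | wc -l | tr -d ' ')

echo "differing bytes, uncompressed: $raw_diff"
echo "differing bytes, compressed:   $gz_diff"
```

The more bytes that differ between two versions of a file, the less a delta-based store like Fossil can reuse from the previous version.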

That said, your existing ZIP archival scheme may be re-copying unchanged images already, in which case Fossil will actually be more efficient, since versions where an image is unchanged refer back to the previously-stored copy of that image.

Besides TIFF, another image file format you might consider is PSD, which can be either compressed or uncompressed.  (Photoshop > Preferences > File Handling > Disable Compression of PSD and PSB Files.)  Plus, if you *do* use PSD compression, PSD layers ensure that only a changed layer needs to be stored again, rather than the whole file.

MS Office docs pose a similar problem to compressed PSD, since they're just specially-structured ZIP files.  Unchanged assets within, say, a PPTX file shouldn't be re-copied into the Fossil repo on checkin, but Fossil will still have to store more data than it would if you could get an uncompressed PPTX file.

By comparison, the LaTeX documents are wonderful for Fossil, since they're uncompressed text, so you'll get massive compression from them.  Not just the normal 2:1 you typically get for text, but potentially many times that because of the delta compression.

> This is not a development
> environment, it is an analysis environment (with code hackery to that
> end).  However, the evolution of files and version control
> requirements probably overlap

Yes, version control systems are good for more than just software source code.

> One differences from the days when I
> wrote "real" (compiled) code

SQL, VBA, and MATLAB are real code.  Don't let anyone tell you different.

> As much as possible,
> everything should be quickly generatable from raw client input data
> files.

That strategy matches exactly with what you want for a VCS: store the source data, not the data generated from it, unless it just takes too much time and effort to re-generate it.

> using vim window
> splitting, it is very efficient to browse the diff output

While you can still do that with Fossil, it's probably better to switch to either "fossil gdiff" coupled with a graphical diff utility of your choice, or to use "fossil ui", which will let you view diffs of checked-in versions in a browser, either inline or side-by-side, your choice.

On Windows, fossil gdiff defaults to WinDiff, which you may already have installed, since MS distributes it with some other software:

  https://en.wikipedia.org/wiki/WinDiff

A lot of people prefer Beyond Compare or Meld, both of which can be configured to act as the graphical diff handler for Fossil.

It should be possible to use Vim's vimdiff feature this way, too.
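
Fossil picks the program for "fossil gdiff" from its gdiff-command setting. A minimal sketch of changing it, assuming the chosen tool is on the PATH; the helper name set_gdiff_tool is hypothetical, and whether a terminal-based tool such as vimdiff behaves well this way is untested here:

```shell
# Hypothetical helper: make TOOL the program that "fossil gdiff" launches.
set_gdiff_tool() {
    tool=$1
    if ! command -v fossil >/dev/null 2>&1; then
        echo "fossil is not installed" >&2
        return 1
    fi
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "diff tool not found: $tool" >&2
        return 1
    fi
    # gdiff-command is the setting consulted by "fossil gdiff";
    # --global applies it to every repository for this user.
    fossil settings gdiff-command "$tool" --global
}
```

For example, "set_gdiff_tool meld" or "set_gdiff_tool vimdiff", followed by "fossil gdiff" inside a checkout.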

> try a few baby steps at some point.

One of the smarter things you can do with Fossil is to use several repositories, one for each focused project, instead of trying to store everything in a single "world" repo.

So, put one project under Fossil management today.  Sync it back and forth, work out the kinks.  Put another project under Fossil in a separate repository next week.  Add more repos as you become comfortable with the process.

Fossil makes managing multiple active repos easy with its "all" command, which lets you do common things to all of the repositories.  "fossil all sync" is a common incantation, for example, meaning "Update all the local Fossil checkouts with the changes from the master repos."
--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

