Feature #78

Dynamic client-side file deletion

Added by jbk over 1 year ago. Updated 8 months ago.

Status: Closed
Priority: Normal
Assigned to: jbk
Category: Core
Target version: -
Start: 07/12/2010
Due date:
% Done: 100%
Estimated time: 8.00 hours

Description

The BOINC clients seem to re-download the sticky files and only delete them when directly asked to do so.
Add support in the scheduler to delete files that are known not to have been in use for a while.

This will replace the current hardcoded mechanism.

delete_file.patch.tar (10 KB) jbk, 01/20/2011 17:32


Related issues

related to Renderfarming.net - BURP - Bug #80: Per-part input files do not make any sense Closed 08/07/2010

Associated revisions

Revision 1281
Added by jbk about 1 year ago

Added downstream patch from Olivier Romand from Renderfarm.fi related to deleting files that are no longer used on the clients, related to #78

Revision 1283
Added by jbk about 1 year ago

Change to use the queue_active_files table in the scheduler, related to #78

Revision 1335
Added by jbk 9 months ago

Use the correct field name when marking files active, related to #78

Revision 1356
Added by jbk 8 months ago

Added the ActiveFileQueueHandler which tracks live files and removes stale ones from the list of active files so that they can be deleted on the clients, fixes #78

Revision 1357
Added by jbk 8 months ago

SQL fixes related to #78

History

Updated by jbk about 1 year ago

Patch received from Olivier Romand on Renderfarm.fi

Updated by jbk about 1 year ago

Tested on BURP-main, seems to work great - I'll just quote myself for reference here:

A good start!

Actually the point about using a separate table for keeping track of active files is to keep files on the clients that are used frequently – rather than just those files which are in use right now. Obviously the currently active files are a subset of these files.

With the Sunflower release [snip] this will become pretty important since libraries will be able to be used across many sessions – even with some time between those sessions where the files are not actively used.

The way it will work:

1) When a session is accepted, all its input files are added to the list of active files (if not already there) with a Unix timestamp set to NOW. This list is the not-yet-existing queue_active_files table, which has a file id and a timestamp. The table may need additional info to provide a quick-lookup index for the scheduler.
2) A (handler) daemon checks every hour for files in that table which are older than NOW-MAX_AGE and removes them. These are files which have not been referenced by any session in MAX_AGE seconds and can be safely removed from the client storage. The age should be stored in the Configuration class as “storage.maxAge”.
3) The scheduler, upon a connection from a client, checks the table and requests the client to remove any files which are no longer in queue_active_files.

This should ensure that popular libraries stay on the clients while infrequently used libraries and input files get purged within MAX_AGE (which would typically be 2 months or so).
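The three steps above can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not the actual BURP scheduler or handler code. The column names (file_id, last_used), the function names and the MAX_AGE value are assumptions made for the example; only the table name queue_active_files comes from the description. The sketch refreshes the timestamp when a file is already listed, which matches the later "poke" behaviour rather than a strict "insert if missing".

```python
import sqlite3
import time

MAX_AGE = 60 * 24 * 3600  # assumed value: roughly two months, in seconds


def create_table(db):
    # Hypothetical schema: one row per active file, with the last time it was referenced.
    db.execute(
        "CREATE TABLE IF NOT EXISTS queue_active_files ("
        " file_id INTEGER PRIMARY KEY,"
        " last_used INTEGER NOT NULL)"
    )


def mark_active(db, file_ids, now=None):
    """Step 1: add a session's input files, or refresh their timestamp."""
    now = int(time.time()) if now is None else now
    db.executemany(
        "INSERT OR REPLACE INTO queue_active_files (file_id, last_used)"
        " VALUES (?, ?)",
        [(fid, now) for fid in file_ids],
    )


def purge_stale(db, max_age=MAX_AGE, now=None):
    """Step 2: remove entries older than now - max_age; returns rows removed."""
    now = int(time.time()) if now is None else now
    cur = db.execute(
        "DELETE FROM queue_active_files WHERE last_used < ?",
        (now - max_age,),
    )
    return cur.rowcount


def files_to_delete(db, client_file_ids):
    """Step 3: files a client holds that are no longer in the active table."""
    active = {row[0] for row in db.execute("SELECT file_id FROM queue_active_files")}
    return sorted(set(client_file_ids) - active)
```

In this sketch the hourly daemon would call purge_stale, and the scheduler would call files_to_delete with the file ids a connecting client reports, then ask the client to delete the result.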

(For now it would also make sense to remove the primary input file immediately once sessions complete, since they are not reused).

We tested the nasty case where a file would be marked as deleted even though some hosts are still computing: the file is requested to be deleted but won’t be until the end of the workunit.

Very nice, this is the kind of borderline test that is very valuable to know about!
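For reference, the behaviour observed in that test (a delete request is honoured only once no running workunit still uses the file) can be modelled with a small sketch. This is a hypothetical model written for illustration, not the BOINC client's actual code; the class and method names are invented here.

```python
class ClientFileStore:
    """Hypothetical model of a client's sticky-file storage."""

    def __init__(self):
        self.files = set()           # files currently stored on the client
        self.in_use = {}             # file name -> set of workunits using it
        self.pending_delete = set()  # delete requests waiting for workunits to end

    def add_file(self, name):
        self.files.add(name)

    def start_workunit(self, workunit, names):
        # Record that this workunit depends on each of the given files.
        for name in names:
            self.in_use.setdefault(name, set()).add(workunit)

    def request_delete(self, name):
        # Delete immediately only if no running workunit needs the file;
        # otherwise remember the request and apply it later.
        if self.in_use.get(name):
            self.pending_delete.add(name)
        else:
            self.files.discard(name)

    def finish_workunit(self, workunit):
        # Release the files and carry out any deletion that was deferred.
        for name, users in self.in_use.items():
            users.discard(workunit)
            if not users and name in self.pending_delete:
                self.pending_delete.discard(name)
                self.files.discard(name)
```

With this model, a delete request that arrives while a workunit is still running leaves the file in place, and the file disappears once the last workunit using it finishes.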

Updated by jbk about 1 year ago

  • Assigned to set to jbk
  • % Done changed from 0 to 30

Updated by jbk 12 months ago

  • % Done changed from 30 to 80

Now only requires new sessions to poke the files they use - and to remove the files they KNOW that no one will use after they have completed.

Updated by jbk 11 months ago

This is related to #80 because the frame input file information is currently being made available during the preprocessing step, but frames do not yet have per-frame files attached.

Updated by jbk 11 months ago

  • % Done changed from 80 to 90

Updated by jbk 8 months ago

This is almost done.

Updated by jbk 8 months ago

  • Status changed from New to Resolved
  • % Done changed from 90 to 100

Applied in changeset r1356.

Updated by jbk 8 months ago

  • Status changed from Resolved to Closed

Files are now removed from the active list by the ActiveFileQueueHandler when they are older than storage.maxActiveTimeLimit (in seconds).
