Deleting web folders after saving web pages to disc

ChrisGreaves
PlutoniumLounger
Posts: 15605
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky


Post by ChrisGreaves »

I save web pages to my hard drive so that I may study them at leisure (hah!) over the next few days. Web pages nowadays are often accompanied by a folder of the same name, and that folder:
(1) Duplicates many files when several pages from the one web site are saved to my hard drive
(2) Increases the probability of long FullNames exceeding limitations imposed by Windows
(3) Increases the search time as RoboCopy scans all folders on my hard drive for items that need backing up, and
(4) Increases the copying time as RoboCopy scans all files on my hard drive for items that need backing up

In short, for someone like me who wants only the text of a web page, the associated folders are a waste of resources, both my time and the computer’s space.

As an experiment I am now deleting such folders the day after they are downloaded, that is, once they have been backed up the previous night.
I don’t care too much about the style sheet; text is text, and I can always reformat it in Word2003 if I need to.
I do care if I lose an embedded image, but if I do, this will be apparent on my next eyeballing of the page and I can recover the image from my backup.
I plan on running this experiment for a month or so, after which I am considering a procedure to delete web page sub-folders automatically as part of my backup process.
If that works, I might advance the deletion to a time immediately preceding the backup process.
Worst case scenario is that you will find me next door at the Starbucks at 9pm desperately trying to relocate the web page I visited earlier in the day.
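
A minimal sketch of the automatic deletion procedure described above, in Python. It assumes the browser names the companion folder after the page with a "_files" suffix (for example Article.htm beside Article_files); that naming convention, the function names and the one-day age threshold are assumptions for illustration, not something confirmed in this thread. It defaults to a dry run that only lists what it would delete.

```python
import shutil
import time
from pathlib import Path

# Assumption: the companion folder is called "<page name>_files".
SUFFIX = "_files"


def find_companion_folders(root, min_age_days=1.0, now=None):
    """Return companion folders under root that sit beside a saved page
    and were last modified at least min_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    found = []
    for folder in Path(root).rglob("*" + SUFFIX):
        if not folder.is_dir():
            continue
        stem = folder.name[: -len(SUFFIX)]
        # Only treat the folder as disposable if its page is still there.
        has_page = any(
            (folder.parent / (stem + ext)).is_file() for ext in (".htm", ".html")
        )
        if has_page and folder.stat().st_mtime <= cutoff:
            found.append(folder)
    return found


def delete_companion_folders(root, min_age_days=1.0, dry_run=True):
    """List (dry_run=True) or delete (dry_run=False) the companion folders."""
    folders = find_companion_folders(root, min_age_days)
    for folder in folders:
        if dry_run:
            print("would delete", folder)
        elif folder.exists():  # may already be gone if it was nested
            shutil.rmtree(folder)
    return folders
```

Something like delete_companion_folders(r"D:\Saved", dry_run=False) would then remove the folders for real; the path is hypothetical. The age threshold is there so that a folder is only removed after it has had a chance to be included in a nightly backup.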

I’d be interested to hear comments.

I have not quantified the costs for and against retaining the web folders.
I just feel a tad offended by an apparent duplication and waste of time.
I recognise that I will not save much of my time unless and until my deletion process becomes completely automatic.
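
For anyone who does want to quantify the cost, here is a rough Python sketch. It totals the bytes held in companion folders, estimates how many of those bytes are exact duplicates (same content saved under several pages), and lists files whose full path reaches the classic 260-character Windows limit. The "_files" suffix and the function name are assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Classic Windows limit on a full path (the count includes a terminator).
MAX_PATH = 260


def measure_companion_folders(root, suffix="_files"):
    """Summarise space used by companion folders under root."""
    root = Path(root).resolve()
    total_bytes = 0
    long_paths = []
    sizes_by_digest = defaultdict(list)
    seen = set()  # avoids counting a file twice if folders are nested
    for folder in root.rglob("*" + suffix):
        if not folder.is_dir():
            continue
        for f in folder.rglob("*"):
            if not f.is_file() or f in seen:
                continue
            seen.add(f)
            size = f.stat().st_size
            total_bytes += size
            if len(str(f)) >= MAX_PATH:
                long_paths.append(f)
            digest = hashlib.sha256(f.read_bytes()).hexdigest()
            sizes_by_digest[digest].append(size)
    # Every copy after the first of identical content counts as duplicated.
    duplicate_bytes = sum(sum(s[1:]) for s in sizes_by_digest.values())
    return {
        "total_bytes": total_bytes,
        "duplicate_bytes": duplicate_bytes,
        "long_paths": long_paths,
    }
```

The sketch reads every file in full to hash it, so on a large collection it will be slow; it is meant as a one-off measurement rather than part of a nightly job.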

Cheers
Chris
There's nothing heavier than an empty water bottle

Rudi
gamma jay
Posts: 25455
Joined: 17 Mar 2010, 17:33
Location: Cape Town

Re: Deleting web folders after saving web pages to disc

Post by Rudi »

Hi Chris,

One has two options when saving web pages:
1. Save as Web Page, Complete (which is the default and produces the folder you refer to)
2. Save as Web Page, HTML Only (which produces a single file containing only the HTML, without the images, styles, backgrounds, etc.)

Both these options appear in the Save As Type dropdown when saving a web page.
Maybe you should experiment with the HTML Only option to avoid all the other items.

BTW:
If you delete the folder that is produced by a complete web save, it automatically deletes the associated HTML file (as it is linked to the folder). The only way to break the link so you can delete one and not the other is to rename the HTML file or the folder. If the two do not share the same name, you can then delete one and the other will remain.
Another thing to note is that if you delete the folder (after renaming the HTML file), you end up with just the HTML --- exactly the same as if you saved as HTML Only.

Lastly, if you want to keep the images, text and formatting of the source web page in a single file, with links that stay active and take you to the source web page when you have web access, then instead of saving as, use Print and select "Save to PDF" or even Microsoft XPS (or OneNote if you have it).
Regards,
Rudi

If your absence does not affect them, your presence didn't matter.

ChrisGreaves
PlutoniumLounger
Posts: 15605
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Deleting web folders after saving web pages to disc

Post by ChrisGreaves »

Rudi wrote: 2. Save as Web Page, HTML Only (which produces a single file containing only the HTML, without the images, styles, backgrounds, etc.)
Thanks Rudi; this works nicely.
Most of the time I need only the text.
Examples are threads from Eileen's Lounge, newspaper articles, technical papers describing some aspect of programming, and so on.
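
If only the text matters, an HTML Only save can be reduced further to plain text. The following is a small Python sketch using the standard library's html.parser; the class and function names are made up for illustration, and it simply drops script and style content and keeps the remaining text, one chunk per line.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

It makes no attempt to preserve layout, tables or images, which fits the "text is text" approach but would lose anything an embedded picture conveys.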

Thanks :thankyou:

P.S. I don't suppose you know a handy shortcut key to implement "Save as Web Page, HTML Only", do you?
I think I have spotted that Firefox 40.0.3 retains the previous save format, so that would keep me happy for now.

Rudi
gamma jay
Posts: 25455
Joined: 17 Mar 2010, 17:33
Location: Cape Town

Re: Deleting web folders after saving web pages to disc

Post by Rudi »

ChrisGreaves wrote: P.S. I don't suppose you know a handy shortcut key to implement "Save as Web Page, HTML Only", do you?
See this webpage: http://helpdeskgeek.com/how-to/save-web ... -explorer/

Unfortunately it guides you through saving in the Single Page *.mht format, which is the whole webpage (pictures, styles and all) in a single file.
It might work for you if you want access to the pictures, etc. (I could not work out how to modify the code to force a save as HTML Only.)
Regards,
Rudi


Leif
Administrator
Posts: 7209
Joined: 15 Jan 2010, 22:52
Location: Middle of England

Re: Deleting web folders after saving web pages to disc

Post by Leif »

Would ScrapBook :: Add-ons for Firefox make life any easier? (I haven't tried it myself.)
Leif

ChrisGreaves
PlutoniumLounger
Posts: 15605
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Deleting web folders after saving web pages to disc

Post by ChrisGreaves »

Leif wrote: Would ScrapBook :: Add-ons for Firefox make life any easier? (I haven't tried it myself.)
Thanks Leif. I installed it and saw the item in the menu bar. It has saved a web page in "Root", so I will have to go and root that out in a few minutes.
Root is not my default save drive for Firefox, that much I know!
:thankyou: