DSM Archive: Difference between revisions
Revision as of 19:17, 16 February 2020
DSM Archive
This is a project to attempt to archive as much info and data from decades of DSM community work.
Archival methods
Quick, panicky, hacky method (for Apache Index pages)
```
wget -N --execute="robots = off" --mirror --convert-links --no-parent --wait=5 http://rob.com/matt/dsm_logger/
```
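- -N: Download a file only if it does not exist locally, or if the remote copy has a different size or a newer timestamp (lets an interrupted mirror resume after a mid-download failure)
- --execute="robots = off": Ignore robots.txt (tsk tsk)
- --mirror: Turn on recursive download with unlimited depth and timestamping (shorthand for -r -N -l inf --no-remove-listing, so the explicit -N is redundant but harmless)
- --convert-links: After the download finishes, rewrite hyperlinks on the saved Index pages so they work locally
- --no-parent: Never ascend to the parent directory ('../')
- --wait=5: Wait five seconds between downloads so as not to overwhelm the server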
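For mirroring more than one site, the one-liner above could be wrapped in a small shell function. This is only a sketch: the name mirror_index, the optional wait argument, and the DRY_RUN switch are all made up here for illustration and are not part of wget.

```shell
# Hypothetical helper (not part of wget): wraps the one-liner above.
# Usage: mirror_index URL [WAIT_SECONDS]
# Set DRY_RUN=1 to print the command instead of running it.
mirror_index() {
    url="${1:-}"
    wait_seconds="${2:-5}"   # default matches --wait=5 in the one-liner

    if [ -z "$url" ]; then
        echo "usage: mirror_index URL [WAIT_SECONDS]" >&2
        return 2
    fi

    # Same flags as the one-liner; only the URL and wait time vary.
    set -- wget -N --execute="robots = off" --mirror --convert-links \
        --no-parent --wait="$wait_seconds" "$url"

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"            # show what would be run
    else
        "$@"                 # actually run wget
    fi
}
```

Running it with DRY_RUN=1 first is a cheap way to check the flags before hitting someone's server.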
WARC Method (proper)
- To be elaborated upon
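Until this section is written up properly, here is one possible starting point, assuming a wget build with WARC support (--warc-file and --warc-cdx, available in GNU wget 1.14 and later). The function name warc_mirror and the DRY_RUN switch are hypothetical. The sketch uses --recursive --level=inf instead of --mirror because wget reportedly disables timestamping (-N) when writing WARC output; treat that as an assumption to verify against your wget version.

```shell
# Hypothetical helper (not part of wget): recursive fetch that also
# records everything into a WARC file.
# Usage: warc_mirror URL OUTPUT_PREFIX
# Set DRY_RUN=1 to print the command instead of running it.
warc_mirror() {
    url="${1:-}"
    name="${2:-}"   # output prefix; wget is expected to write
                    # OUTPUT_PREFIX.warc.gz and OUTPUT_PREFIX.cdx

    if [ -z "$url" ] || [ -z "$name" ]; then
        echo "usage: warc_mirror URL OUTPUT_PREFIX" >&2
        return 2
    fi

    # --recursive --level=inf replaces --mirror (no timestamping);
    # --warc-file records requests and responses, --warc-cdx adds an index.
    set -- wget --execute="robots = off" --recursive --level=inf \
        --no-parent --wait=5 \
        --warc-file="$name" --warc-cdx "$url"

    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Unlike the plain mirror, the WARC keeps the HTTP headers alongside the page bodies, which is what archive.org-style tooling expects.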
DSM-ECU (Yahoo! Groups)
Mailing list text
- Status: Found!
- Archived by: ArchiveTeam
- Archived on: archive.org
- Link: https://archive.org/details/yahoo-groups-2017-04-05T22-38-56Z-c86b7b
- Web Archive: https://archive.org/download/yahoo-groups-2017-04-05T22-38-56Z-c86b7b/dsm-ecu.zRYyz8P.warc.gz
- CDX Index: https://archive.org/download/yahoo-groups-2017-04-05T22-38-56Z-c86b7b/dsm-ecu.zRYyz8P.cdx.gz
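The CDX index is a gzipped text file with one line per captured record, so it can be used to see what the WARC contains without downloading the whole archive. A minimal sketch, assuming the usual CDX layout where the first line is a header starting with "CDX" followed by one-letter column codes, and "a" marks the original URL column. The function name cdx_urls is made up for this example, and the exact columns in this particular index have not been checked.

```shell
# Hypothetical helper: print the original-URL column of a CDX index.
# Reads uncompressed CDX text on stdin. The header line (" CDX N b a ...")
# is used to locate the column whose code is "a" (original URL).
#
# Example (not run here):
#   wget -qO- <CDX Index link above> | gunzip -c | cdx_urls | head
cdx_urls() {
    awk '
        $1 == "CDX" {
            # Header: $2.. are column codes, so code $i describes data field i-1.
            for (i = 2; i <= NF; i++)
                if ($i == "a" && !col) col = i - 1
            next
        }
        col { print $col }   # data line: print the original URL
    '
}
```

Piping the result through sort and uniq gives a quick list of distinct captured URLs.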
Files
- Status: NEEDED!