I need help with wget and I hope to find it here. My goal is to mirror a site (the soon-to-be-replaced website at work) and I tried to do so with wget. I am totally able to mirror the pages and filter out the unwanted file types.

My problem is this: several pages (e.g. /internal/forms) link to files stored elsewhere, like /files/12345/form1.docx. So wget doesn’t save the file in the folder internal/forms but creates a new folder files/12345.

I understand why this happens but I really need the file in internal/forms and I can’t find a solution - is there any way to achieve that? Thank you so much for your help!

  • Chais
    1 year ago

    I don’t think there’s a way to make the links visible to wget. You could maybe try creating symlinks for the files in their incorrect locations, but that involves creating a bunch of incorrect directories. Alternatively, you can let wget do its thing and clean up afterwards.
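
For the "clean up afterwards" route, here is a rough Python sketch, not a tested recipe for any particular site. It assumes the mirror was made without wget's link conversion (so the pages still contain absolute links like `href="/files/12345/form1.docx"`), that pages were saved as UTF-8 `.html` files (for example `internal/forms/index.html`), and that file names don't collide within one page folder. The function name `relocate_linked_files` and the regex are made up for illustration.

```python
import re
import shutil
from pathlib import Path

# Assumption: links look like href="/files/<id>/<name>" (the /files/ prefix is from the question).
LINK_RE = re.compile(r'href="(/files/[^"#?]+)"')


def relocate_linked_files(mirror_root: Path) -> None:
    """Copy each linked /files/... target next to the page that links to it,
    then rewrite the link in that page to the bare file name."""
    for page in mirror_root.rglob("*.html"):
        html = page.read_text(encoding="utf-8")

        def fix(match):
            # Map the absolute URL path onto the mirrored directory tree.
            source = mirror_root / match.group(1).lstrip("/")
            if not source.is_file():
                return match.group(0)  # not downloaded: leave the link untouched
            target = page.parent / source.name
            if not target.exists():
                shutil.copy2(source, target)  # copy, so other pages can still find it
            return f'href="{source.name}"'

        new_html = LINK_RE.sub(fix, html)
        if new_html != html:
            page.write_text(new_html, encoding="utf-8")
```

You'd call it as `relocate_linked_files(Path("path/to/mirror"))` on the folder wget created. It copies rather than moves, so the leftover `files/` folder can be deleted by hand once everything looks right.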