Download File from URL
There are a couple of ways to do this. As mentioned, using the developer tools may work (more likely it will just give you the URL of the file), and right-clicking the link will work. Alternatively, there are these options.
In Chrome
- Go to the URL
- Right-click the webpage
- Select Save As...
For verification purposes, here are png, jpg, and mp3 links. Follow them and try these steps. However, in my experience, if you already have a URL to a file, opening up Chrome and following these steps is rather tedious, so here is an alternative.
In Command Line
- Open your favorite terminal emulator
- Type
curl -O URL
- where O is written as a capital letter
- and URL is the URL to the file, e.g. http://example.com/file.mp3
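If you'd rather script the download than run it from the shell, the same step can be sketched in Python's standard library. The URL and filename below are placeholders, not real resources:

```python
import urllib.request

def download_file(url, dest):
    """Fetch the resource at `url` and save it to the local path `dest`,
    roughly what `curl -O URL` does, but with a name you choose."""
    urllib.request.urlretrieve(url, dest)
    return dest

# Example with a placeholder URL:
# download_file("http://example.com/file.mp3", "file.mp3")
```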
For PowerShell, this example works great:
Invoke-WebRequest -Uri http://files.animatedsuperheroes.com/themes/spiderman94.mp3 -OutFile "c:\Spiderman94.mp3"
This was confirmed with Win10 x64 1607.
Intercept the link click.
Download the page as plain text with JavaScript and a CORS bypass.
Build a temporary new link and then click it.
https://jsfiddle.net/5ot02w4y/3/
$("a[download]").click(function(e) {
e.preventDefault();
$.ajax({
url: "https://cors-anywhere.herokuapp.com/" + $(this).attr("href"),
headers: {
"X-Requested-With": "true"
},
success: function(data) {
var a = $('<a></a>');
a.attr("href", window.URL.createObjectURL(new Blob([data], {
type: 'text/plain'
})));
a.attr("download", "page.html");
$("body").append(a);
a[0].click();
}
});
});
The download attribute only works for same-origin URLs, or the blob: and data: schemes. The example you're trying would work if it were on the same website you're running it on. Try to change the URL to a link on the www.w3schools.com domain and you'll see it'll work.
In modern browsers that support HTML5, the following is possible:
<a href="link/to/your/download/file" download>Download link</a>
You also can use this:
<a href="link/to/your/download/file" download="filename">Download link</a>
This will allow you to change the name of the file actually being downloaded.
This answer is outdated. We now have the download attribute (see also this link to MDN).
If by "the download link" you mean a link to a file to download, use
<a href="http://example.com/files/myfile.pdf" target="_blank">Download</a>
The target="_blank" will make a new browser window appear before the download starts. That window is usually closed once the browser discovers that the resource is a file download.
Note that file types known to the browser (e.g. JPG or GIF images) will usually be opened within the browser.
You can try sending the right headers to force a download, as outlined e.g. here. (Server-side scripting or access to the server settings is required for that.)
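As a rough illustration of those "right headers", here is a minimal sketch using Python's standard-library HTTP server. The filename and payload are made up for the example; in a real setup you would serve the actual file:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class ForceDownloadHandler(BaseHTTPRequestHandler):
    """Answer every GET with headers that ask the browser to save the
    body as a file instead of displaying it inline."""
    def do_GET(self):
        body = b"example payload"  # in practice, read the real file from disk
        self.send_response(200)
        # A generic binary type plus "Content-Disposition: attachment"
        # is what makes the browser download rather than open the resource.
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Disposition", 'attachment; filename="file.mp3"')
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To try it locally:
# HTTPServer(("127.0.0.1", 8000), ForceDownloadHandler).serve_forever()
```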
As explained in the comments, the download attribute only works for same origin URLs.
I solved this by exposing an endpoint in the backend that redirects to the url with the file I want to download, and pointing to that on the client side.
Frontend
<a href='/download-file' download>Download</a>
Backend
router.get('/download-file', (req, res, next) => {
res.redirect('http://example.com/url-to-the-file');
});
Beware of what you are redirecting to if you don't have control over those files.
In my case the file was hosted in a CDN, which made the origin of the URL different, but I still had control over it.
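For comparison, the same redirect endpoint can be sketched with Python's standard library instead of Express. The route path and target URL are the same placeholders as in the Express snippet above:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class DownloadRedirectHandler(BaseHTTPRequestHandler):
    """Redirect /download-file to the real file URL, mirroring the
    Express route in the answer above."""
    def do_GET(self):
        if self.path == "/download-file":
            self.send_response(302)
            self.send_header("Location", "http://example.com/url-to-the-file")
            self.end_headers()
        else:
            self.send_error(404)

# To try it locally:
# HTTPServer(("127.0.0.1", 8000), DownloadRedirectHandler).serve_forever()
```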
The <a> is for directing a user to a specified link. In HTML5 you may add the download attribute and link to a specific file to signify that you intend to serve the file as a download.
In this case the href value should point to a specific file and not just a directory. The html file with the link must also be marked with a doctype for html5 at the top <!DOCTYPE html>.
You must serve the html/download from the same domain as your site or else deal with CORS which is another topic.
Example
<a href='http://example.com/myFileToDownload.html' download>Download Source</a>
Heya!
So I have a problem. I'd like to download episodes from the official Pokémon TV website, https://watch.pokemon.com Thing is, I want episodes in a specific language, let's say https://watch.pokemon.com/sv-se/#/season?id=pokemon-black-white. Now, I could just look at the HTML code by inspecting the source code and manually copy the URL for each episode, and download the episodes with youtube-dlp. But that is a lot of work, as I'd have to do that for each individual episode, and for each season I want. So what I want is to write a script (let's say, Python). I'd like to grab each link (href) from each <a> element of the class "btn-play" (or id "play_btn_reference").
The problem is that when I try to do this with BeautifulSoup and requests, all I get is nothing; if you choose "View page source" in your browser while on the site, you'll see what I mean. I read on a StackOverflow post that Selenium would help with scraping the HTML as-is, but I couldn't exactly get it to work.
This is what I have tried thus far:
from bs4 import BeautifulSoup
from selenium import webdriver

# Render the page in a real browser so JavaScript-built content exists
browser = webdriver.Chrome()
url = "some_page"
browser.get(url)
html_source = browser.page_source
browser.quit()

# Parse and dump the rendered HTML for inspection
soup = BeautifulSoup(html_source, 'lxml')
fname = "results.txt"
with open(fname, "w", encoding="utf-8") as output:
    output.write(soup.prettify())
Any of you fine chaps know how I could move forward? I'd just prefer to not have to manually download each episode, and though I have a little bit of experience with this, I don't know how to get past this hurdle. Any help is much appreciated!
Edit: poor formatting of the post
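One possible way forward: once Selenium has handed you the rendered page_source, pull the hrefs out with a small parser. This sketch uses only the standard library; the "btn-play" class name is taken from the question itself, and whether it matches the live site's markup is an untested assumption:

```python
from html.parser import HTMLParser

class PlayLinkParser(HTMLParser):
    """Collect href values from <a> tags whose class list contains btn-play."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        d = dict(attrs)
        # class is a space-separated list, so split before matching
        if tag == "a" and "btn-play" in d.get("class", "").split():
            if "href" in d:
                self.links.append(d["href"])

def extract_episode_links(html):
    """Return every btn-play link in the given (rendered) HTML."""
    parser = PlayLinkParser()
    parser.feed(html)
    return parser.links
```

Feed this the browser.page_source string rather than a plain requests response, since the site only builds the episode list after its JavaScript runs.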
Hit Ctrl+S and save it as an HTML file (not MHTML). Then, in the <head> tag, add a <base href="http://downloaded_site's_address.com"> tag. For this webpage, for example, it would be <base href="http://stackoverflow.com">.
This makes sure that all relative links point back to where they're supposed to instead of to the folder you saved the HTML file in, so all of the resources (CSS, images, JavaScript, etc.) load correctly instead of leaving you with just HTML.
See MDN for more details on the <base> tag.
The HTML, CSS and JavaScript are sent to your computer when you request them over HTTP (for instance, when you enter a URL in your browser); therefore, you have these parts and could replicate them on your own PC or server. But if the website has server-side code (databases, some kind of authentication, etc.), you will not have access to it, and therefore won't be able to replicate it on your own PC/server.