The simplest thing I can think of is to open the file you want to save as source code, which you can do in Chrome on Android by prepending view-source: to the URL, like so:
view-source:https://my-site.tld/xxx/my_page.html
You can then either copy and paste this code into a local file, or use the share button to open it with a compatible app of your choosing.
If this seems like too much of a pain, you might consider using PHP to decide whether the file should be viewed or downloaded. For example, if the IP address of the request matches your phone's, download the file instead of displaying it.
<?php
/**
 * File url: https://my-site.tld/xxx/my_page.php
 */
// FILTER_VALIDATE_IP replaces the deprecated FILTER_SANITIZE_STRING;
// header values must not contain newlines.
$remote_address = filter_input(INPUT_SERVER, 'REMOTE_ADDR', FILTER_VALIDATE_IP);
if ($remote_address === '56.57.58.59') {
    header('Content-Disposition: attachment; filename="filename.html"');
    header('Content-Type: text/html');
}
?>
<!-- Begin actual page content -->
<!DOCTYPE html>
<html lang="en-US">
<meta charset="UTF-8">
....
Alternatively, you could simply create a separate page that allows a download regardless of the requesting IP address.
<?php
/**
 * File url: https://my-site.tld/xxx/my_page-download.php
 */
header('Content-Disposition: attachment; filename="filename.html"');
header('Content-Type: text/html');
readfile('path/to/file');
Additional information about the Content-Disposition header may be found in the Mozilla Developer Docs. It's not the only resource, of course, it's just my personal favorite.
You can copy the raw HTML content (from view-source) into the textarea on this page
https://jsfiddle.net/o19q5tzy
and press download.
The webpage only contains
<a id="link" href="" download='tmp.html'>download</a><br>
<textarea oninput="link.href = window.URL.createObjectURL(new Blob([this.value], {type: 'text/plain'}))" style="width: 100ch; height: 40ch"></textarea>
Is it possible to download a website's entire code, HTML, CSS and JavaScript files? - Stack Overflow
Can we download a webpage completely with chrome.downloads.download? (Google Chrome Extension)
Hit Ctrl+S and save it as an HTML file (not MHTML). Then, in the <head> tag, add a <base href="http://downloaded_site's_address.com"> tag. For this webpage, for example, it would be <base href="http://stackoverflow.com">.
This makes sure that all relative links point back to where they're supposed to instead of to the folder you saved the HTML file in, so all of the resources (CSS, images, JavaScript, etc.) load correctly instead of leaving you with just HTML.
See MDN for more details on the <base> tag.
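If you save pages this way often, inserting the <base> tag can be scripted. Here is a minimal Python sketch; it assumes the saved file's opening head tag is written literally as `<head>` with no attributes, and the document string is made up for illustration:

```python
def add_base_tag(html, base_url):
    """Insert a <base href=...> right after the opening <head> tag,
    so relative links resolve against the original site."""
    return html.replace("<head>", '<head><base href="%s">' % base_url, 1)

doc = "<html><head><title>Saved page</title></head><body>...</body></html>"
print(add_base_tag(doc, "http://stackoverflow.com"))
```

A real saved page may have attributes on its head tag (e.g. `<head profile=...>`), in which case this naive string replacement would miss it.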
The HTML, CSS and JavaScript are sent to your computer when you request them over HTTP (for instance, when you enter a URL in your browser), so you have those parts and could replicate them on your own PC or server. But if the website has server-side code (databases, some type of authentication, etc.), you will not have access to it, and therefore won't be able to replicate it on your own PC/server.
The downloads API downloads a single resource only. If you want to save a complete web page, then you can first open the web page, then export it as MHTML using chrome.pageCapture.saveAsMHTML, create a blob:-URL for the exported Blob using URL.createObjectURL and finally save this URL using the chrome.downloads.download API.
The pageCapture API requires a valid tabId. For instance:
// Create new tab, wait until it is loaded and save the page
chrome.tabs.create({
  url: 'http://example.com'
}, function(tab) {
  chrome.tabs.onUpdated.addListener(function func(tabId, changeInfo) {
    if (tabId == tab.id && changeInfo.status == 'complete') {
      chrome.tabs.onUpdated.removeListener(func);
      savePage(tabId);
    }
  });
});

function savePage(tabId) {
  chrome.pageCapture.saveAsMHTML({
    tabId: tabId
  }, function(blob) {
    var url = URL.createObjectURL(blob);
    // Optional: chrome.tabs.remove(tabId); // to close the tab
    chrome.downloads.download({
      url: url,
      filename: 'whatever.mhtml'
    });
  });
}
To try it out, put the previous code in background.js, add the permissions to manifest.json (as shown below), and reload the extension. Then example.com will be opened, and the web page will be saved as a self-contained MHTML file.
{
  "name": "Save full web page",
  "version": "1",
  "manifest_version": 2,
  "background": {
    "scripts": ["background.js"]
  },
  "permissions": [
    "pageCapture",
    "downloads"
  ]
}
No, it does not download all the files for you: images, JS, CSS, etc. You should use a tool like HTTrack.
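To see why a single request is not enough, here is a small Python sketch (standard library only, with a made-up HTML snippet) that lists the extra assets a page references; a mirroring tool like HTTrack has to fetch each of these separately:

```python
from html.parser import HTMLParser

class AssetCollector(HTMLParser):
    """Collect URLs of assets a page references beyond the HTML itself."""
    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "img" and "src" in a:
            self.assets.append(a["src"])
        elif tag == "link" and a.get("rel") == "stylesheet" and "href" in a:
            self.assets.append(a["href"])
        elif tag == "script" and "src" in a:
            self.assets.append(a["src"])

sample = (
    '<html><head><link rel="stylesheet" href="style.css">'
    '<script src="app.js"></script></head>'
    '<body><img src="logo.png"></body></html>'
)
collector = AssetCollector()
collector.feed(sample)
print(collector.assets)  # ['style.css', 'app.js', 'logo.png']
```

Each of those URLs is a separate HTTP request, which is exactly what "Save As" on a single HTML file does not make.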
Download File from URL
There are a couple of ways to do this. As mentioned, using the developer tools could work (more likely it will give you the URL to the file), and right-clicking the link will work. Alternatively, there are these options.
In Chrome
- Go to the URL
- Right-click the webpage
- Select Save As...
For verification purposes, here are png, jpg, and mp3 links. Follow them and try these steps. However, in my experience, if you already have a URL to a file, opening up Chrome and following these steps is rather tedious, so here is an alternative.
In Command Line
- Open your favorite terminal emulator
- Type:
curl -O URL
- where O is written in capital
- and URL is the URL to the file, e.g. http://example.com/file.mp3
For PowerShell, this example works great:
Invoke-WebRequest -Uri http://files.animatedsuperheroes.com/themes/spiderman94.mp3 -OutFile "c:\Spiderman94.mp3"
This was confirmed with Win10 x64 1607.
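If neither curl nor PowerShell is at hand, Python's standard library can do the same. A minimal sketch; the URL and filename in the comment are placeholders:

```python
import urllib.request

def download(url, dest):
    """Download the resource at url to the local path dest."""
    urllib.request.urlretrieve(url, dest)

# e.g. download("http://example.com/file.mp3", "file.mp3")
```

`urlretrieve` works for http://, https:// and file:// URLs; for large files or retry logic you would want something more robust.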
Heya!
So I have a problem. I'd like to download episodes from the official Pokémon TV website, https://watch.pokemon.com. Thing is, I want episodes in a specific language, let's say https://watch.pokemon.com/sv-se/#/season?id=pokemon-black-white. Now, I could just look at the HTML code by inspecting the source code, manually copy the URL for each episode, and download the episodes with yt-dlp. But that is a lot of work, as I'd have to do that for each individual episode, and for each season I want. So what I want is to write a script (let's say, in Python) to grab each link (href) from each <a> element of the class "btn-play" (or id "play_btn_reference").
The problem is that when I try to do this with BeautifulSoup and requests, all I get is nothing; if you choose "View page source" in your browser while on the site, you'll see what I mean. I read on a StackOverflow post that Selenium would help with scraping the HTML as it is actually rendered, but I couldn't exactly get it to work.
This is what I have tried thus far:
from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Chrome()
url = "some_page"
browser.get(url)
html_source = browser.page_source
soup = BeautifulSoup(html_source, 'lxml')
browser.quit()

fname = "results.txt"
with open(fname, "w", encoding="utf-8") as output:
    output.write(soup.prettify())
Any of you fine chaps know how I could move forward? I'd just prefer to not have to manually download each episode, and though I have a little bit of experience with this, I don't know how to get past this hurdle. Any help is much appreciated!
Edit: fixed poor formatting of the post
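For the extraction step the post describes, once Selenium has handed over the rendered HTML, pulling the href out of every <a> with class "btn-play" doesn't even need BeautifulSoup. A sketch using only the standard library, with a made-up snippet of markup standing in for the rendered page:

```python
from html.parser import HTMLParser

class PlayLinkCollector(HTMLParser):
    """Collect href values from <a> tags whose class list includes 'btn-play'."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "btn-play" in a.get("class", "").split():
            self.links.append(a.get("href"))

rendered = (
    '<div><a class="btn-play" href="/episode-1">Play</a>'
    '<a class="btn-play" href="/episode-2">Play</a>'
    '<a class="other" href="/skip">Skip</a></div>'
)
collector = PlayLinkCollector()
collector.feed(rendered)
print(collector.links)  # ['/episode-1', '/episode-2']
```

In the real script, `rendered` would come from `browser.page_source` after the page has finished loading; the collected links could then be fed to yt-dlp one by one.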
Hi, I'm a computer noob. I'm doing some online courses, and every time I download some slides as PDFs, they save as HTML documents. I would like them to be normal .pdf files. Anyone have a fix? A YouTube video, idk? Pls send help
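One quick way to see what actually got downloaded is to check the file's first bytes: real PDFs start with %PDF-, while an HTML document starts with text like <!DOCTYPE or <html. A small Python sketch (the file path you pass in is up to you):

```python
def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"
```

If this returns False for your downloaded "slides", the server sent an HTML page (often a viewer or login page) instead of the PDF itself, and renaming the file to .pdf won't help; you'd need to find the direct link to the PDF resource.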